Professor of Computer Science. Machine Intelligence Lab, UCL AI Centre, Department of Computer Science, University College London.
Artificial Intelligence | Machine Learning | Computing | Books
https://www.mircomusolesi.org
Reposted by Mirco Musolesi, Jack Stilgoe
We replace the entropy bonus in PPO with a *complexity* bonus, encouraging structured and stochastic policies that are robust to different scaling factors and can work in environments with variable exploration needs.
Read more:
arxiv.org/abs/2509.20509
w/ @mircomusolesi.bsky.social
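
To make the idea concrete, here is a minimal sketch of swapping the entropy bonus in a PPO-style loss for a complexity bonus. The post does not define the complexity measure, so this sketch assumes an LMC-style statistical complexity (normalised entropy times squared distance from the uniform distribution); the paper's actual definition may differ. All function names (`entropy`, `disequilibrium`, `complexity`, `ppo_loss`) and the default coefficients are hypothetical and chosen for illustration only.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy of each categorical distribution, normalised to [0, 1]."""
    probs = np.asarray(probs, dtype=float)
    n = probs.shape[-1]
    h = -np.sum(probs * np.log(probs + eps), axis=-1)
    return h / np.log(n)

def disequilibrium(probs):
    """Squared Euclidean distance from the uniform distribution."""
    probs = np.asarray(probs, dtype=float)
    n = probs.shape[-1]
    return np.sum((probs - 1.0 / n) ** 2, axis=-1)

def complexity(probs):
    """Assumed LMC-style complexity: entropy times disequilibrium.

    It vanishes for both uniform (pure noise) and deterministic policies,
    so maximising it favours policies that are stochastic yet structured.
    """
    return entropy(probs) * disequilibrium(probs)

def ppo_loss(ratio, advantage, probs, bonus_fn, coef=0.01, clip=0.2):
    """Clipped PPO surrogate loss with a pluggable regularisation bonus."""
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    clipped = np.clip(ratio, 1.0 - clip, 1.0 + clip)
    surrogate = np.minimum(ratio * advantage, clipped * advantage)
    # The loss is minimised, so the surrogate and the bonus enter negatively.
    return -(surrogate.mean() + coef * bonus_fn(probs).mean())
```

Passing `bonus_fn=entropy` gives the standard entropy-regularised objective, while `bonus_fn=complexity` gives the assumed variant. Under this assumption a uniform policy earns the maximum entropy bonus but zero complexity bonus, which is one way to read the post's "structured and stochastic" description.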