Pietro Novelli
@pienovelli.bsky.social
Physicist, working on machine learning for dynamical systems | reinforcement learning | machine learning for science | transfer learning for atomistic potentials | statistical learning theory & optimization.
By the time I finished working on this paper, I had more research questions than when I started. I take this fertility of ideas as a very good sign 😃. If you’re in Vancouver, consider checking it out. I’ll be at the West Ballroom A-D from 16:30 to 19:30, poster #6907
December 12, 2024 at 5:19 PM
To put some flesh on this core idea, we developed a neat theoretical foundation that combines conditional mean embeddings with policy mirror descent. This foundation ultimately yields sample complexity results, highlighting the interplay between exploration and exploitation.
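For concreteness, here is a minimal tabular sketch of a policy mirror descent step with the KL (negative-entropy) mirror map, the update PMD is usually associated with. This is illustrative only, not the paper's operator-based implementation; all names and the toy data are mine.

```python
import numpy as np

def pmd_step(policy, Q, eta):
    # One KL-regularized policy mirror descent step:
    # pi_{k+1}(a|s) ∝ pi_k(a|s) * exp(eta * Q(s, a)), normalized over actions.
    logits = np.log(policy) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)

# Toy usage: 4 states, 3 actions, random action-value estimates
rng = np.random.default_rng(0)
policy = np.full((4, 3), 1 / 3)   # uniform initial policy
Q = rng.normal(size=(4, 3))       # e.g. estimated from a learned world model
print(pmd_step(policy, Q, eta=0.5))
```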
December 12, 2024 at 5:19 PM
The return is a (conditional) expected value, and we realized that there are now mature ML tools to model such expected values directly, without first solving intermediate, more difficult problems.
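One such tool is the conditional mean embedding, which the paper builds on. Below is a minimal sketch of its standard kernel ridge regression estimator on synthetic transitions; the kernel choice, data, and function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma**2))

# Synthetic transitions (s, a) -> s' standing in for environment data
rng = np.random.default_rng(0)
SA = rng.normal(size=(200, 3))                         # state-action pairs
S_next = SA[:, :2] + 0.1 * rng.normal(size=(200, 2))   # next states

n, lam = len(SA), 1e-3
K = gaussian_kernel(SA, SA)

def expected_f_next(sa, f_values):
    # Conditional mean embedding estimate of E[f(s') | s, a]:
    # ridge-regression weights w = (K + n*lam*I)^{-1} k(SA, sa),
    # then a weighted average of f over the training next-states.
    k = gaussian_kernel(SA, sa[None, :]).ravel()
    w = np.linalg.solve(K + n * lam * np.eye(n), k)
    return w @ f_values

# Example: expected first coordinate of the next state at (s, a) = 0
print(expected_f_next(np.zeros(3), S_next[:, 0]))
```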
December 12, 2024 at 5:19 PM
So, what’s all this fuss about? Reinforcement learning, in essence, is an optimization problem: we want to maximize returns.
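In symbols, with standard (not paper-specific) notation, the objective is the expected discounted return

\[
\max_{\pi}\; J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right], \qquad \gamma \in [0, 1),
\]

where states and actions $(s_t, a_t)$ are generated by running the policy $\pi$ in the environment.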
December 12, 2024 at 5:19 PM
This quote neatly encapsulates the core of our paper “Operator World Models for Reinforcement Learning”, which we’re presenting today at @NeurIPS. arxiv.org/abs/2406.19861
Operator World Models for Reinforcement Learning
Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) due to the inaccessi...
arxiv.org
December 12, 2024 at 5:19 PM