Anastasiia Pedan
pedanana.bsky.social
my main takeaway from a talk on reward design in rl: ai only beat humans when they were asked not to collaborate 👀👀
August 25, 2025 at 7:07 PM
Would you be surprised to learn that many empirical implementations of value-aware model learning (VAML) algos, including MuZero, lead to incorrect model & value functions when training stochastic models 🤕? In our new @icmlconf.bsky.social 2025 paper, we show why this happens and how to fix it 🦾!
June 19, 2025 at 2:40 AM