Faisal Mohamed
thefirstfaisal.bsky.social
MSc at Mila. Reinforcement learning, representation learning, and probabilistic inference.
Reposted by Faisal Mohamed
Training #deepRL agents has always been a tricky and unstable process. What causes these instabilities? We study the coupling between policy training and value estimation and find a chain effect of value churn and policy churn in popular DRL agents.
December 11, 2024 at 5:34 PM