External lecturer at Johannes Kepler University Linz, Austria
Drug Discovery | Deep Learning | RL
http://www.arjonamedina.com
To me it looks like a clever way of applying PPO-like clipping within a supervised framework, constrained by a fixed reference model. Although some parts of its formulation are very similar to PPO, I wouldn't describe it as RL. (1/5)🧵
Once you know the root of the problem, you can find nice solutions 😉
In our case, a very simple regularization term did the job.
go.bsky.app/AcP9Lix
➕We are just getting started, please send me others who should be added
go.bsky.app/JeFdryY #ML