Website: https://jdeschena.github.io
Make sure to check it out to learn why training with PPO for too long makes your agent collapse!
Jiaheng Hu of UTexas on Unsupervised Skill Discovery for HRL
@skandermoalla.bsky.social of EPFL: Representation and Trust in PPO
Adil Zouitine of IRT Saint Exupery/Hugging Face: Time-Constrained Robust MDPs
We have two accepted papers from my lab:
1. Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers, on Wednesday, East Exhibit Hall A-C #2010 (1/3)
Among "oldies but goldies", this tutorial by Rabiner on Hidden Markov Models (HMMs) is dear to my heart. HMMs are one of the simplest statistical models where some variables are not observed, and we love them for it. 🧵
www.cs.ubc.ca/~murphyk/Bay...
I've noticed that not all accounts seem to be eligible to be added. Does anyone know what's up with that? 🤔