ML Research Template (https://github.com/CLAIRE-Labo/python-ml-research-template)
Trained on 15T tokens in 1,000+ languages, it’s built for transparency, responsibility & the public good.
Read more: actu.epfl.ch/news/apertus...
❌ You want rewards, but GRPO only works online?
❌ You want offline, but DPO is limited to preferences?
✅ QRPO can do both!
🧵Here's how we do it:
She will be joining the University of Zurich as a professor this summer and is hiring PhD students and postdocs. You should apply to her group!
Her website: koloskova.github.io
Make sure to check it out to learn why training with PPO for too long makes your agent collapse!
Jiaheng Hu of UTexas on Unsupervised Skill Discovery for HRL
@skandermoalla.bsky.social of EPFL: Representation and Trust in PPO
Adil Zouitine of IRT Saint Exupery/Hugging Face: Time-Constrained Robust MDPs
This will be the official account of the Eastern European Machine Learning (EEML) community.
Follow us for news regarding our summer schools, workshops, education/community initiatives, and more!
@caglarai.bsky.social
🧑‍💻 github.com/CLAIRE-Labo/...
Wed 11 Dec 11 am - 2 pm PST
West Ballroom A-D #6403
@caglarai.bsky.social @andreamiele.bsky.social @razvan-pascanu.bsky.social