If I seem very angry, check if I have been watered in the last 24 hours.
Now 🇺🇸 flavoured, previously available in 🇨🇦 and 🇩🇪
I'm way too political and loud in general, so please be warned.
If you are interested in reinforcement learning, sample efficiency, or compute efficiency, go check it out. See you in Rio!
✅ ~4.5× fewer parameters than SimbaV2
✅ Scales to vision-based RL
👉 arxiv.org/pdf/2509.25174
Thanks to Florian Vogt @joemwatson.bsky.social @jan-peters.bsky.social
Also, why do people write "RL doesn't work" papers so passionately?
Paper: Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening
( www.arxiv.org/abs/2601.21590 )
Yep, it's that same old HLE. They submitted the paper on 07 May 2025. And no, I don't know what the point of publishing it like that is either. Looks good on CVs, I guess.
✅ Institutional impact from making an LLM library installable on our hell-hole of a cluster? 🏆
@ericeaton.bsky.social @mkearnsphilly.bsky.social @aaroth.bsky.social @sikatasengupta.bsky.social @optimistsinc.bsky.social
📖 Replicable Reinforcement Learning with Linear Function Approximation
🔗 arxiv.org/abs/2509.08660
In this paper, we study formal replicability in RL with linear function approximation. The... (1/6)
😮 Want to have stable on-policy RL without filling your GPU with an enormous replay buffer? 😮
🤖 Are you a roboticist and just want your RL code to run? 🤖
🎉 Fear not, we started adding new REPPO versions! 🎉
github.com/cvoelcker/rs...
@pranav-nlp.bsky.social presented "You Cannot Sound Like GPT": Signs of language discrimination and resistance in computer science publishing.
Paper: arxiv.org/abs/2505.08127
#NLProc
fortune.com/2026/01/21/n...