https://araffin.github.io/
Types of Reinforcement Learning Paper
Original image: @xkcd.com
Small errors in the code poison results in ways that may not be visibly obvious.
LLMs are great when people verify outputs; the path to hell is when they don't.
RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026!
Call for Papers is up now:
Abstract: Mar 1 (AOE)
Submission: Mar 5 (AOE)
Excited to see what you’ve been up to - Submit your best work!
rl-conference.cc/callforpaper...
Please share widely!
We designed this monograph to be self-contained, covering: Grid, Random & Quasi-random search, Bayesian & Multi-fidelity optimization, Gradient-based methods, Meta-learning.
arxiv.org/abs/2410.22854
In this second post, I continue from DQN on to the Soft Actor-Critic (SAC) algorithm and its extensions.
araffin.github.io/post/rl103/
I highly recommend watching it, regardless of whether you're interested in UX.
1. How Desktop UX is effectively dead
2. Why I hate the term UX/UI with the heat of 1000 suns
3. How OSS can actually innovate in #ux
www.youtube.com/watch?v=1fZT...
Word-level diffing just landed. 🎉
It's been a night-and-day difference for us—seeing exactly what changed within each line.
Demo: jonathancoletti.github.io/CarDodgingGym/
Documentation: stable-baselines3.readthedocs.io/en/master/gu...
We have an amazing lineup of speakers: @Mathieugeist, @gio_ramponi, Theresa Eimer, @SarahKeren_, @araffin2, @c_rothkopf, and @AdrienBolland
⏰ Friday 6th February
📍University of Mannheim
Workshop on Reinforcement Learning 2026, taking place on 𝐅𝐞𝐛𝐫𝐮𝐚𝐫𝐲 𝟔, 𝟐𝟎𝟐𝟔, at the 𝐔𝐧𝐢𝐯𝐞𝐫𝐬𝐢𝐭𝐲 𝐨𝐟 𝐌𝐚𝐧𝐧𝐡𝐞𝐢𝐦, Germany.
Participation in the workshop is 𝐟𝐫𝐞𝐞 𝐨𝐟 𝐜𝐡𝐚𝐫𝐠𝐞!
Check the program and register: www.wim.uni-mannheim.de/doering/conf...
Thanks to Paul Vicol (@paulvicol.bsky.social) for his tireless work on this new option, as well as the OpenReview team.
🎬 This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images.
🎉 Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
This may be a warning to many humanoid companies. All your promises don’t matter to the public if your robot looks or acts dumb.
youtu.be/b_SNExtznd4?...
michaelbastos.com/blog/why-sel...
#programming #softwaredevelopment #tech #blog
Lots of progress in RL research over last 10 years, but too much performance-driven => overfitting to benchmarks (like the ALE).
1⃣ Let's advance science of RL
2⃣ Let's be explicit about how benchmarks map to formalism
1/X
Modern package management for Robotics with Pixi!
prefix.dev/blog/reprod...
#ROS #ROSCon #ROSCon2025
link: www.tylervigen.com/spurious-cor...
found via @stefanjudis.com newsletter
Day 1: www.youtube.com/watch?v=Use5...
Day 2: www.youtube.com/watch?v=rh2o...
Day 3: www.youtube.com/watch?v=9lzF...
In this post, I share tools and habits that help me move quickly from idea to result without sacrificing reliability.
I've been having a lot of fun animating a mini-series about this topic, and the main part is now out.
youtu.be/j0wJBEZdwLs