CIFAR AI Chair, RL_Conference chair. Creating generalist problem-solving agents for the real world. He/him/il.
world-model-mila.github.io
rl-conference.cc
March 1: Abstract DL (AoE)
March 5: Submission DL (AoE)
Conference: Montreal, Quebec, Canada,
August 16th-19th, 2026.
RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026!
Call for Papers is up now:
Abstract: Mar 1 (AOE)
Submission: Mar 5 (AOE)
Excited to see what you’ve been up to - Submit your best work!
rl-conference.cc/callforpaper...
Please share widely!
When: 11:30-noon
Where: Level Room 30A-E
Applicant names, profiles, demographics
Reviewer names, profiles, comments, and scores
Lots of progress in RL research over the last 10 years, but too much of it is performance-driven => overfitting to benchmarks (like the ALE).
1⃣ Let's advance science of RL
2⃣ Let's be explicit about how benchmarks map to formalism
1/X
1) It does not train a critic (no need when the variance is small)
2) The SCORE FUNCTION (difficult to call this an advantage) is computed over a batch of samples that share the same initial prompt (similar to the vine sampling method from TRPO)
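
A rough sketch of point 2, assuming the score for each sample is its reward standardized against the other samples drawn from the same prompt. The post does not spell out the formula, so the function name, the mean/std normalization, and the reward values here are all illustrative assumptions. Note there is no learned critic anywhere: the group mean plays the role of the baseline.

```python
from statistics import mean, pstdev


def group_relative_scores(rewards, eps=1e-8):
    """Score each sample relative to the others from the same prompt.

    Assumption: standardize by the group's mean and (population) std.
    No critic is trained; the group mean is the only baseline.
    """
    baseline = mean(rewards)
    spread = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - baseline) / (spread + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
scores = group_relative_scores(rewards)
```

Samples above the group mean get a positive score and samples below it get a negative one; if every sample in the group earns the same reward, all scores are zero and that prompt contributes no gradient signal.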
cloud.google.com/blog/product...