RL Theory Lecture Notes: https://arxiv.org/abs/2312.16730
(shameless repost of my pinned tweet)
New preprint where we look at the mechanisms through which next-token prediction produces models that succeed at downstream tasks.
The answer involves a metric we call the "coverage profile", not cross-entropy.
academicjobsonline.org/ajo/jobs/30865
BTW-
Not quite ready for a postdoc? We updated the TCS Masters programs spreadsheet:
www.cs.princeton.edu/~smattw/mast...
Any career stage and in the (SF) Bay Area?
Save the date for TOCA-SV on 11/7!
A totally new framework based on ~backtracking~ for using process verifiers to guide inference, w/ connections to approximate counting/sampling in theoretical CS.
Paper: www.arxiv.org/abs/2510.03149
Apply here: jobs.careers.microsoft.com/global/en/jo...
We're hiring postdocs and senior researchers in AI/ML broadly, and in specific areas like test-time scaling and science of DL. Postdoc applications due Oct 22, 2025. Senior researcher applications considered on a rolling basis.
Links to apply: aka.ms/msrnyc-jobs
These are positions for up to 2 years, starting in July 2026.
Application deadline: October 22, 2025
📝 Soliciting abstracts that advance foundational understanding of reasoning in language models, from theoretical analyses to rigorous empirical studies.
📆 Deadline: Sept 3, 2025
I have experience with the governance of TMLR, COLT, and ALT, and I think I've shown myself to be a conscientious and engaged community member.
If you're doing RL in sim, why not use the sim to its full potential? Reset to any state! (gym.Env.reset() is not all we need.)
PDF: arxiv.org/abs/2404.15417
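A toy sketch of what "reset to any state" could look like, assuming a simulator whose state can be set directly. `ChainSim` and `reset_to` are hypothetical names made up for illustration; this is not the paper's method and not part of the Gym API.

```python
from dataclasses import dataclass


@dataclass
class ChainSim:
    """Toy chain simulator (hypothetical): states 0..n-1, reward only at the last state."""

    n: int = 10
    state: int = 0

    def reset(self) -> int:
        # Standard episodic reset: always return to the fixed initial state.
        self.state = 0
        return self.state

    def reset_to(self, state: int) -> int:
        # Assumed extra capability: jump the simulator to an arbitrary valid state.
        if not 0 <= state < self.n:
            raise ValueError(f"state must be in [0, {self.n - 1}], got {state}")
        self.state = state
        return self.state

    def step(self, action: int):
        # action is +1 (move right) or -1 (move left); the state is clipped to the chain.
        self.state = min(max(self.state + action, 0), self.n - 1)
        done = self.state == self.n - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


sim = ChainSim(n=10)
sim.reset_to(8)                    # start next to the goal instead of at state 0
state, reward, done = sim.step(+1)
```

With only `reset()`, a learner has to walk the whole chain before it sees any reward; with `reset_to`, it can start collecting data from any state it chooses.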
📅 When: Mon, June 30 | 16:00 CET
What: Fireside chat w/ Peter Bartlett & Vitaly Feldman on communicating a research agenda, followed by mentorship roundtable to practice elevator pitches & mingle w/ COLT community!
let-all.com/colt25.html
Full posting to come in a bit.
arxiv.org/abs/2503.14337
Submission website: openreview.net/group?id=lea...
📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!
🗓️ Deadline: May 19, 2025
New paper (appearing at ICML) led by the amazing Audrey Huang (ahahaudrey.bsky.social) with Adam Block, Qinghua Liu, Nan Jiang, and Akshay Krishnamurthy (akshaykr.bsky.social).
1/11
04/29: Max Simchowitz (CMU)
05/06: Jeongyeol Kwon (Univ. of Wisconsin-Madison)
05/20: Sikata Sengupta & Marcel Hussing (Univ. of Pennsylvania)
05/27: Dhruv Rohatgi (MIT)
06/03: David Janz (Univ. of Oxford)
06/10: Nneka Okolo (MIT)
Join us to discuss this at our exciting workshop at @icmlconf.bsky.social 2025: EXAIT!
exait-workshop.github.io
#ICML2025
A new paper with Zak Mhammedi and Dhruv Rohatgi:
The Computational Role of the Base Model in Exploration
arxiv.org/abs/2503.07453
Tuesday March 25, 6 PM UTC.