Ryan Sullivan
@ryanpsullivan.bsky.social
PhD Candidate at the University of Maryland researching reinforcement learning and autocurricula in complex, open-ended environments.

Previously RL intern @ SonyAI, RLHF intern @ Google Research, and RL intern @ Amazon Science
Reposted by Ryan Sullivan
"As researchers, we tend to publish only positive results, but I think a lot of valuable insights are lost in our unpublished failures."

New blog post: Getting SAC to Work on a Massive Parallel Simulator (part I)

araffin.github.io/post/sac-mas...
Getting SAC to Work on a Massive Parallel Simulator: An RL Journey With Off-Policy Algorithms (Part I) | Antonin Raffin | Homepage
This post details how I managed to get the Soft-Actor Critic (SAC) and other off-policy reinforcement learning algorithms to work on massively parallel simulators (think Isaac Sim with thousands of ro...
March 10, 2025 at 8:22 AM
I’m heading to AAAI to present our work on multi-objective preference alignment for DPO from my internship with GoogleAI. If anyone wants to chat about RLHF, RL in games, curriculum learning, or open-ended environments please reach out!
February 26, 2025 at 8:29 PM
Reposted by Ryan Sullivan
Looking for a principled evaluation method for ranking *general* agents or models, i.e. agents that are evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
February 24, 2025 at 3:25 PM
Reposted by Ryan Sullivan
We released the OLMo 2 report! Ready for some more RL curves? 😏

This time, we applied RLVR iteratively! Our initial RLVR checkpoint on the RLVR dataset mix shows a low GSM8K score, so we did another RLVR on GSM8K only and another on MATH only 😆.

And it works! A thread 🧵 1/N
January 6, 2025 at 6:34 PM
Reposted by Ryan Sullivan
My recurring refrain of the year is to really use the environments in PufferLib. There's no reason not to have your environments run at a million FPS on a single CPU core: github.com/PufferAI/Puf...
GitHub - PufferAI/PufferLib: Simplifying reinforcement learning for complex game environments
December 9, 2024 at 2:47 PM
Have you ever wanted to add curriculum learning (CL) to an RL project but decided it wasn't worth the effort?

I'm happy to announce the release of Syllabus, a library of portable curriculum learning methods that work with any RL code!

github.com/RyanNavillus...
GitHub - RyanNavillus/Syllabus: Synchronized Curriculum Learning for RL Agents
December 5, 2024 at 4:11 PM
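To give a feel for what a curriculum method does, here is a minimal sketch of the general pattern: sample tasks whose success rate sits near a learnable "frontier." This is a hypothetical `DifficultyCurriculum` class for illustration only, not Syllabus's actual API; see the repo for the real interface.

```python
import random

class DifficultyCurriculum:
    """Minimal success-rate-based curriculum over a set of task levels.

    Tracks a running success rate per level and preferentially samples
    levels whose success rate is close to a target (hard but solvable).
    """

    def __init__(self, num_levels, target=0.5, alpha=0.1):
        self.success = [0.5] * num_levels  # optimistic initial estimate
        self.target = target               # preferred success rate
        self.alpha = alpha                 # EMA step size

    def sample(self):
        # Weight each level by how close its success rate is to the target.
        weights = [1.0 - abs(s - self.target) for s in self.success]
        return random.choices(range(len(self.success)), weights=weights)[0]

    def update(self, level, solved):
        # Exponential moving average of binary episode outcomes.
        s = self.success[level]
        self.success[level] = (1 - self.alpha) * s + self.alpha * float(solved)

curriculum = DifficultyCurriculum(num_levels=10)
level = curriculum.sample()        # pick a level for the next episode
curriculum.update(level, solved=True)
```

Libraries like Syllabus wrap this kind of logic so the same curriculum can be synchronized across environments and reused with any RL training code.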
Another awesome iteration of Genie! I fully agree with training generalist agents in simulation like this, though I believe in using real games to teach long-term strategies. Still, it’s easy to see how SIMA and Genie will continue to improve, and maybe even give us a true foundation model for RL.
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
December 4, 2024 at 7:55 PM
This is one of my favorite lines of work in RL. When I was starting my PhD, I was working on a multi-agent evaluation problem, having just finished a "voting math" class in my last semester at Purdue. I scribbled some notes about how games in a tournament could be viewed as votes… 1/2
Ok, time to start posting some actual AI things. 😅

This week I will tell you about several papers in the theme of social choice theory (and agent/model evals), starting with an old paper of ours that I am still excited about:

"Evaluating Agents using Social Choice Theory"

🧵 1/N
November 28, 2024 at 2:49 AM
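The "games as votes" idea can be sketched with Copeland aggregation, a standard social-choice rule (used here purely as an illustration, not as the method from the paper): each game is a vote for the winner over the loser, and players are ranked by pairwise majority wins minus losses.

```python
from collections import defaultdict
from itertools import combinations

def copeland_ranking(games):
    """Rank players from head-to-head results, treating each game
    as a 'vote' for the winner over the loser.

    games: list of (winner, loser) pairs.
    Returns players sorted by Copeland score: +1 for each opponent
    beaten in the pairwise majority matrix, -1 for each lost to.
    """
    wins = defaultdict(int)  # wins[(a, b)] = number of games a beat b
    players = set()
    for w, l in games:
        wins[(w, l)] += 1
        players.update((w, l))

    score = {p: 0 for p in players}
    for a, b in combinations(sorted(players), 2):
        if wins[(a, b)] > wins[(b, a)]:
            score[a] += 1
            score[b] -= 1
        elif wins[(b, a)] > wins[(a, b)]:
            score[b] += 1
            score[a] -= 1

    # Sort alphabetically first so ties break deterministically.
    return sorted(sorted(players), key=lambda p: -score[p])

games = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
print(copeland_ranking(games))  # ['A', 'B', 'C']
```

A beats B in their pairwise majority, B beats C, and A vs. C is tied, so A ranks first. This is the flavor of analysis the thread below builds on.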
I just got here, thanks @rockt.ai for putting together an open-endedness starter pack! If there's anyone else working on exploration, curriculum learning, or open-ended environments, leave a reply so I can follow you!

I'll be sharing some cool curriculum learning work in a few days, stay tuned!
Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD
November 22, 2024 at 5:35 PM