Sarvesh Patil
@nagababa.bsky.social
Your friendly neighborhood roboticist!
PhD student @cmurobotics.bsky.social

Interested in Dexterous Manipulation, Democratization of Robots and Sensors, Sample Efficient RL, Soft Robotics, Causality, Multi-Agent Systems.

servo97.github.io
Reposted by Sarvesh Patil
It was a dream come true to teach the course I wish existed at the start of my PhD. We built up the algorithmic foundations of modern-day RL, imitation learning, and RLHF, going deeper than the usual "grab bag of tricks". All 25 lectures + 150 pages of notes are now public!
June 20, 2025 at 3:53 AM
Reposted by Sarvesh Patil
I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.

arxiv.org/abs/2405.06161
A First Introduction to Cooperative Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized ...
January 7, 2025 at 4:25 PM
Reposted by Sarvesh Patil
There are lots of people who've influenced AI but haven't won Nobel prizes.
I discuss a tiny sliver of them in this parody of @billyjoelofficial.bsky.social's "We Didn't Start the Fire"...
Enjoy!

youtube.com/shorts/qDSYA...
We Didn't Win A Nobel (Billy Joel Parody)
YouTube video by MUSICODE
December 23, 2024 at 1:43 PM
Reposted by Sarvesh Patil
If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.
December 19, 2024 at 12:55 AM
Reposted by Sarvesh Patil
A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
December 2, 2024 at 6:45 PM
Reposted by Sarvesh Patil
Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet, and I wrote a tutorial blog post on flow matching, with lots of illustrations and intuition: dl.heeere.com/conditional-...

We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
November 27, 2024 at 9:00 AM
Intro Post
Hello World!
I'm a 2nd-year Robotics PhD student at CMU, working on distributed dexterous manipulation, accessible soft robots and sensors, sample-efficient robot learning, and causal inference.

Here are my cute robots:
PS: Videos are old and sped up. They move slower in the real world :3
November 23, 2024 at 6:49 PM