Sai Prasanna
saiprasanna.in
See(k)ing the surreal

Causal World Models for Curious Robots @ University of Tübingen/Max Planck Institute for Intelligent Systems 🇩🇪

#reinforcementlearning #robotics #causality #meditation #vegan
Pinned
📌 Thread of threads for research ideas 💡 Collaborations are most welcome 😁
Use Beta NLL for regression when you also predict standard deviations. It's a simple change to the standard NLL that works reliably better.
September 10, 2025 at 9:27 AM
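A minimal sketch of what the Beta NLL tip above amounts to, assuming a Gaussian likelihood with a predicted mean and standard deviation: the per-sample NLL is scaled by the variance raised to the power beta, and that scaling factor is treated as a constant during backpropagation. The function names and the default `beta=0.5` are illustrative assumptions, not taken from the post. Plain Python has no autodiff, so the gradient with respect to the mean is written out by hand to show the effect of the weighting.

```python
import math


def beta_nll(mean, std, target, beta=0.5):
    """Beta NLL for a single Gaussian prediction (illustrative sketch).

    The usual Gaussian NLL (constant term dropped) is multiplied by
    variance**beta. In an autodiff framework the weight must be held
    constant, e.g. `var.detach() ** beta` in PyTorch.
    """
    var = std ** 2
    nll = 0.5 * (math.log(var) + (target - mean) ** 2 / var)
    weight = var ** beta  # treated as a constant, no gradient flows through it
    return weight * nll


def beta_nll_grad_mean(mean, std, target, beta=0.5):
    """Hand-derived gradient of beta_nll with respect to the mean.

    Because the weight is held constant, this is just the NLL gradient
    (mean - target) / var scaled by var**beta.
    """
    var = std ** 2
    return var ** beta * (mean - target) / var


# beta = 0 gives the plain NLL gradient, beta = 1 gives the MSE-like gradient.
grad_nll_like = beta_nll_grad_mean(3.0, 2.0, 1.0, beta=0.0)
grad_mse_like = beta_nll_grad_mean(3.0, 2.0, 1.0, beta=1.0)
```

With `beta=0` the mean gradient is divided by the predicted variance, so points given a large variance contribute little to fitting the mean; with `beta=1` that division cancels and the mean is fit as in plain squared-error regression. Values in between trade off the two.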
If open-endedness fundamentally has to be measured subjectively, which factors of the agent make it open-ended when we fix humans as the final arbiter or evaluator? Does the agent's embodiment, action space, etc. matter to a human evaluator of open-endedness?
August 2, 2025 at 11:53 PM
Tübingen : Freiburg :: Introvert : Extrovert
March 27, 2025 at 1:50 PM
Reposted by Sai Prasanna
This might be the most fun I’ve had writing an essay in a while. Felt some of that old going-nuts-with-an-idea energy flowing.

open.substack.com/pub/contrapt...
Discworld Rules
And LOTR is brain-rot for technologists
open.substack.com
March 8, 2025 at 2:53 AM
Reposted by Sai Prasanna
This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024).

If you know 1 of {RL, controls} and want to understand the other, this is a good starting point.

PDF: arxiv.org/abs/2406.00592
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming
In this paper we describe a new conceptual framework that connects approximate Dynamic Programming (DP), Model Predictive Control (MPC), and Reinforcement Learning (RL). This framework centers around ...
arxiv.org
March 2, 2025 at 4:19 PM
I realized how I background-process tonnes of information, from work/research to emotional stuff. And it works well: it leads to good research ideas and wise processing of tough situations! But it's so hard to learn to trust this, as conscious thinking for solving problems feels more under my "control".
March 1, 2025 at 10:02 PM
TIL: the "Clever Hans cheat", a subtle but interesting issue with next-token prediction. With the purely forward next-token prediction objective, teacher forcing can lead to learning dynamics where models don't even generalize "in-distribution"!!

arxiv.org/abs/2403.06963
The pitfalls of next-token prediction
Can a mere next-token predictor faithfully model human intelligence? We crystallize this emerging concern and correct popular misconceptions surrounding it, and advocate a simple multi-token objective...
arxiv.org
March 1, 2025 at 9:29 PM
Break the Monday productivity ceiling with this super awesome 4-hour techno set

on.soundcloud.com/hXTcWTTsYUNK...
Yetti Meissner @ Sisyphos Hammerhalle 09/08/14
🖤 BOOKING CONTACT chris@stilvortalent.de
on.soundcloud.com
January 27, 2025 at 1:56 PM
Enimatek
Kore-G · Enimatek · Song · 2023
open.spotify.com
January 20, 2025 at 10:56 AM
If I have a really good photo that could be potentially used in many contexts, what's the best place to make money with it? My friend has a really good eye for photos and we want to try a side venture selling some of her stuff
December 30, 2024 at 3:19 PM
Reposted by Sai Prasanna
RIP Manmohan Singh. Dude changed all our lives in 1991 for the better. His stint as turnaround finance minister was revolutionary even if his later stint as PM was rather hapless (for which Nehru dynasty is more to blame).
Manmohan Singh - Wikipedia
en.wikipedia.org
December 27, 2024 at 3:44 AM
Reposted by Sai Prasanna
Looks like a cool study. Lots to learn from ants about large scale coordination
www.pnas.org/doi/10.1073/...

"Our results exemplify how simple minds can easily enjoy scalability while complex brains require extensive communication to cooperate efficiently."

h/t @petersuber.bsky.social
Comparing cooperative geometric puzzle solving in ants versus humans | PNAS
Biological ensembles use collective intelligence to tackle challenges together, but suboptimal coordination can undermine the effectiveness of grou...
www.pnas.org
December 25, 2024 at 9:57 PM
This album is going to be timeless

open.spotify.com/album/32yQDx...
Mahal
Glass Beams · EP · 2024 · 5 songs
open.spotify.com
December 26, 2024 at 4:10 PM
Doing The Beeston Bump
Leafcutter John · Yes! Come Parade With Us · Song · 2019
open.spotify.com
December 18, 2024 at 5:54 PM
Does augmenting ourselves with V/LLMs to fill cognitive gaps make self-actualization even more difficult on average?

This stands in stark contrast with augmentation strategies like meditation or psychedelics, which are more difficult and slower to show positive outcomes.
December 18, 2024 at 10:40 AM
Reposted by Sai Prasanna
The slides for my lectures on (Bayesian) Active Learning, Information Theory, and Uncertainty are online now 🥳 They cover quite a bit from basic information theory to some recent papers:

blackhc.github.io/balitu/

and I'll try to add proper course notes over time 🤗
December 17, 2024 at 6:50 AM
Manifold Garden is a trippy game

www.youtube.com/watch?v=vLt4...
Manifold Garden - Launch Trailer | PS4
YouTube video by PlayStation
www.youtube.com
December 16, 2024 at 4:26 PM
Reposted by Sai Prasanna
Check out Motivo, a behavioral foundation model for humanoid control by FAIR.

It's a one-of-its-kind unsupervised RL project, and it comes with a demo that is SO fun to play with!

metamotivo.metademolab.com

(for the record, they use compile and cudagraphs -> github.com/facebookrese...)
December 14, 2024 at 12:44 AM
Reposted by Sai Prasanna
Modern life is a Turing tarpit: “Everything is possible, but nothing is easy”

By contrast any traditional lifestyle is sub-Turing

All the people pining for rituals, steady routines, deep work etc etc etc… YOU CAN’T HANDLE THE TURING COMPLETENESS

en.wikipedia.org/wiki/Turing_...
Turing tarpit - Wikipedia
en.wikipedia.org
December 13, 2024 at 2:13 AM
Reposted by Sai Prasanna
If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
December 10, 2024 at 9:55 PM
One thing coding with LLMs has helped me with a lot over the past months is visualisations. I'm churning out code to visualize many aspects of agent behavior, which I wouldn't have done before due to my mental friction in writing such code.

Such code to do visualisations is also easy to verify.
December 11, 2024 at 5:59 PM
When predicting the discrete joint distribution of two variables with a neural network, which loss is best to use? KL on the joint and the two marginals? Or is there anything better?
December 11, 2024 at 5:37 PM
Reposted by Sai Prasanna
In an effort to play a small part in creating additional value on this site, I'm going to post one-per-day a paper we wrote that was published in 2024. Together with memes. Skipping holidays/weekends. In random order.

Would love your thoughts on them.

I'll keep them threaded for easy finding!

>
November 25, 2024 at 9:34 AM
Reposted by Sai Prasanna
The RL book by Kevin Murphy is finally online (copied shamelessly from the other place) arxiv.org/abs/2412.05265
Reinforcement Learning: An Overview
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement learning and sequential decision making, covering value-based RL, policy-gradient methods, model-based met...
arxiv.org
December 9, 2024 at 6:25 AM