Vaishnavh Nagarajan
@vaishnavh.bsky.social
Foundations of AI. I like simple and minimal examples and creative ideas. I also like thinking about the next token 🧮🧸

Google Research | PhD, CMU

https://arxiv.org/abs/2504.15266 | https://arxiv.org/abs/2403.06963
vaishnavh.github.io
Reposted by Vaishnavh Nagarajan
I really enjoyed "When We Cease to Understand the World", although it's more fiction than history of science
June 22, 2025 at 2:30 PM
Reposted by Vaishnavh Nagarajan
“Science in History” by Bernal is my first recommendation. The work of Ian Hacking is a good recommendation for probability.
June 23, 2025 at 2:12 AM
haha that's a new idiom for me. it's perfect! and the flip side is "target and (potentially) regret", which causes quite a lot of stress. (what if your work gets rejected by the community or, worse, overlooked?)
June 5, 2025 at 3:58 PM
but these pressures are real and have always persisted.



I think @abeirami.bsky.social may be interested in this rant.
June 5, 2025 at 3:43 PM
but now I've the maturity to seek validation from things like "a specific person complimenting my work" or even better, "a meaningful citation where someone substantially builds on my work." (ofc, i also seek internal validation/satisfaction but I gotta be realistic, lol).
June 5, 2025 at 3:43 PM
i had intense first-hand struggle with a lot of these effects in my phd since i had <= 1 paper/year for the most part. i started managing it only after getting visibly recognized by experts for one of my papers at one point. i still struggle with it at some level.
June 5, 2025 at 3:43 PM
then there are many other insidious feedback cycles like the fact that publishing more => more visibility => more opportunities/networks/interfaces with the community/more citations => more opportunities/internships etc., => more papers
June 5, 2025 at 3:37 PM
for example, with the advent of twitter, there's a pressure to stay constantly visible and to have many different things to say every now and then (bec everyone else is doing that), rather than pitch your one paper again and again which starts feeling awkward :-(
June 5, 2025 at 3:33 PM
someday I hope to write a blog about "all the other forces that discourage me" from publishing less. people always say "publish less!" but without acknowledging these varied and nuanced forces
June 5, 2025 at 3:33 PM
all other incentivization strategies I had thought of are much more negative/mean. like
- "evaluating someone based on some bottom k papers" or
- "judging negatively for publishing >N papers"
June 5, 2025 at 3:30 PM
haha thank you! honored you feel that way!

btw, i just noticed this: this sort of compliment is actually a great way to incentivize people to be more selective in publishing papers (and to counter all the other forces that discourage me from my rate of ~1 paper a year)
June 5, 2025 at 3:28 PM
Read the full paper: arxiv.org/abs/2504.15266

This work was with great collaborators at CMU: @chenhenrywu.bsky.social who co-led, Charles Ding & @adtraghunathan.bsky.social! Go follow them to see what else they’re up to! 11/
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of the present-day l...
arxiv.org
June 2, 2025 at 5:26 PM
But, there's a lot of scope for exciting work:
→ generalizing these insights to real cows,
→ studying RL/CoT for creativity,
→ understanding surprising behaviors of seed-conditioning 10/👇🏽
June 2, 2025 at 5:26 PM
Of course, this is all a study of spherical cows. 🐮
Given the noisy, subjective studies of real cows, we believe an objective study brings
→much-needed clarity of thought (like disentangling the two modes of creativity),
→more ideas,
→better-defined experiments. 9/👇🏽
June 2, 2025 at 5:26 PM
Our vision is that seed-conditioning can help models sample a latent thought and articulate that one thought into words,

but temp sampling has to articulate multiple latent thoughts in parallel to produce a marginal next-word distribution -- this is more burdensome! 8/👇🏽
June 2, 2025 at 5:26 PM
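One way to write down the intuition in the post above (notation is ours, not the paper's): letting z denote a latent thought,

```latex
% Illustrative notation, not taken from the paper.
% Seed-conditioning can first commit to a single latent thought z and then
% articulate it, i.e. decode from p(x_t | z, x_{<t}); plain next-token
% (temperature) sampling must instead realize the marginal over all thoughts:
\[
  p(x_t \mid x_{<t}) \;=\; \sum_{z} p(z \mid x_{<t}) \, p(x_t \mid z, x_{<t}).
\]
```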
Next, we revisit how to produce randomness: the go-to temp sampling 🌡️ vs. injecting a random prefix (seed-conditioning). 🌱

Remarkably, seed-conditioning produces meaningful diversity even w *greedy* decoding 🤑; it is competitive with temp & in some conditions, superior. 7/👇🏽
June 2, 2025 at 5:26 PM
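A minimal sketch of the two randomization schemes in this thread, assuming an off-the-shelf GPT-2 through HuggingFace transformers (purely illustrative; the paper's tasks, models, and seed format are its own):

```python
# Illustrative sketch only: contrasts temperature sampling with seed-conditioning
# (random token prefix + greedy decoding), using GPT-2 as a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt_ids = tok("Write a short riddle:", return_tensors="pt").input_ids

# (1) Temperature sampling: randomness injected at every decoding step.
temp_out = model.generate(
    prompt_ids, do_sample=True, temperature=1.0, max_new_tokens=30,
    pad_token_id=tok.eos_token_id,
)

# (2) Seed-conditioning: randomness injected once, as a random token prefix
#     ("the seed"); decoding afterwards is fully greedy/deterministic.
seed_ids = torch.randint(0, tok.vocab_size, (1, 8))
seed_out = model.generate(
    torch.cat([seed_ids, prompt_ids], dim=1), do_sample=False,
    max_new_tokens=30, pad_token_id=tok.eos_token_id,
)

print(tok.decode(temp_out[0], skip_special_tokens=True))
print(tok.decode(seed_out[0], skip_special_tokens=True))
```

Re-running branch (2) with a fresh random prefix gives a different continuation even though decoding stays greedy; that one-shot injection of randomness is the source of the diversity described in the post.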