Google Research | PhD, CMU | vaishnavh.github.io
https://arxiv.org/abs/2504.15266 | https://arxiv.org/abs/2403.06963
Paper: arxiv.org/abs/2504.15266
NeurIPS 2025 Official LLM Policy:
neurips.cc/Conferences/...
Excited to share our ICML Oral paper on learning dynamics in linear RNNs!
with @clementinedomine.bsky.social @mpshanahan.bsky.social and Pedro Mediano
openreview.net/forum?id=KGO...
If you have thoughts/recommendations, please share!
vaishnavh.github.io/2025/04/29/h...
Sharing our new Spotlight paper @icmlconf.bsky.social: Training Dynamics of In-Context Learning in Linear Attention
arxiv.org/abs/2501.16265
Led by Yedi Zhang with @aaditya6284.bsky.social and Peter Latham
→ LLMs are limited in creativity as they learn to predict the next token
→ creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ #MLSky #AI #arxiv 🧵👇🏽
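Here's a minimal sketch of the seed-conditioning idea as I read it from the thread: inject randomness through the *input* (a random "seed" string prefixed to the prompt) and decode greedily, instead of relying on output-side temperature sampling. The model ("gpt2"), seed format, and prompt below are placeholders of mine, not the paper's setup.

```python
# Minimal sketch of seed-conditioning (assumptions: gpt2 as a stand-in model,
# an arbitrary random-seed string format; the paper's exact setup may differ).
import random
import string
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Write a short story about a lighthouse keeper."

def generate_with_seed(prompt: str) -> str:
    # Randomness comes from the input: a random string prepended to the prompt.
    seed = "".join(random.choices(string.ascii_lowercase, k=8))
    conditioned = f"[seed: {seed}]\n{prompt}"
    inputs = tokenizer(conditioned, return_tensors="pt")
    # Greedy decoding: no output-side sampling, yet different seed strings
    # can lead to different continuations.
    out = model.generate(**inputs, do_sample=False, max_new_tokens=64)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Two calls differ only through the injected seed string.
print(generate_with_seed(prompt))
print(generate_with_seed(prompt))
```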
Two papers find that entropy *minimization* (confidence maximization) helps performance,
while the RL-on-one-sample paper finds that entropy *maximization* (increasing exploration) alone helps performance?!
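To make the contrast concrete, here is a toy PyTorch sketch of my own (not either paper's objective): the two directions differ only in the sign of an entropy term added to a generic loss.

```python
# Toy sketch of an entropy term in a token-level objective (assumption: plain
# cross-entropy as the base loss; neither paper's actual objective is reproduced).
import torch
import torch.nn.functional as F

def loss_with_entropy_term(logits, targets, coeff, minimize_entropy=True):
    # logits: (batch, vocab), targets: (batch,)
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(probs * log_probs).sum(dim=-1).mean()
    # Entropy *minimization* (confidence maximization) adds +coeff * entropy;
    # entropy *maximization* (more exploration) subtracts it instead.
    sign = 1.0 if minimize_entropy else -1.0
    return ce + sign * coeff * entropy

logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
print(loss_with_entropy_term(logits, targets, coeff=0.01, minimize_entropy=True))
print(loss_with_entropy_term(logits, targets, coeff=0.01, minimize_entropy=False))
```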
A lot happens in the world every day—how can we update LLMs with belief-changing news?
We introduce a new dataset "New News" and systematically study knowledge integration via System-2 Fine-Tuning (Sys2-FT).
1/n
What if the news appears in the context upstream of the *same* FT data?
🚨 Contextual Shadowing happens!
Prefixing the news during FT *catastrophically* reduces learning!
10/n
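To make the setup concrete, here is a tiny sketch with my own placeholder news text and string format (not the paper's data): the same fine-tuning example either stands alone or has the news prefixed upstream in its context.

```python
# Toy illustration of the contextual-shadowing setup (formats and text are my
# assumptions, not the paper's): the same FT target, with vs. without the news
# prefixed in-context upstream of it.
news = "Breaking: the city of Arden has elected its first robot mayor."
ft_example = "Q: Who is the mayor of Arden?\nA: A robot."

plain_ft = ft_example                      # FT data alone
prefixed_ft = f"{news}\n\n{ft_example}"    # same FT data, news prefixed in context

# Reported finding: fine-tuning on the prefixed version catastrophically reduces
# how much the model learns about the news, compared to the plain version.
for name, text in [("plain", plain_ft), ("prefixed", prefixed_ft)]:
    print(f"--- {name} ---\n{text}\n")
```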
We identified a common root cause of many safety vulnerabilities and pointed out some paths forward for addressing it!
- a benchmark for open-ended creativity
- a demonstration of the challenges of next-token prediction
- a technique to improve transformer randomness through inputs rather than sampling
arxiv.org/abs/2504.15266
blog.neurips.cc/2025/05/02/r...