Nikunj Saunshi
nsaunshi.bsky.social
AI Reasoning and Foundations
Senior Research Scientist, Google | PhD, Princeton University
First post here, so I'm sharing an earlier NeurIPS '24 paper on stacking and its inductive biases.

TLDR: Stacking, i.e., growing model depth gradually, not only improves training efficiency (if done right), but also significantly improves performance on downstream tasks that require *reasoning*, at similar perplexity. 1/n
March 10, 2025 at 3:41 PM
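The core idea of gradual stacking can be sketched in a few lines. This is a minimal toy illustration, not the paper's code: all names here (`grow_by_stacking`, `stacking_schedule`) are hypothetical, and the "layers" are stand-ins for trained transformer blocks. The assumption shown is that depth is grown by copying the current layers on top of themselves between training phases.

```python
# Hypothetical sketch of gradual stacking: train a shallow model, then grow
# depth by stacking a copy of the trained layers on top, and keep training.

def grow_by_stacking(layers):
    """Double the depth by stacking a copy of the current layers on top."""
    return layers + [dict(layer) for layer in layers]  # copy, don't alias

def stacking_schedule(initial_depth, target_depth):
    """Record the depth after each growth phase until the target is reached."""
    layers = [{"id": i} for i in range(initial_depth)]
    depths = [len(layers)]
    while len(layers) < target_depth:
        # ... in a real run, train `layers` for some steps here ...
        layers = grow_by_stacking(layers)
        depths.append(len(layers))
    return depths

print(stacking_schedule(3, 12))  # depth grows 3 -> 6 -> 12
```

The point of the sketch is the schedule: each phase starts from a model whose lower layers are already trained, which is where the efficiency gain and the claimed inductive bias come from.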
Reposted by Nikunj Saunshi
Excited to share our work with friends from MIT/Google on Learned Asynchronous Decoding! LLM responses often contain chunks of tokens that are semantically independent. What if we could train LLMs to identify such chunks and decode them in parallel, thereby speeding up inference? 1/N
February 27, 2025 at 12:38 AM
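The decoding idea above can be illustrated with a small sketch. This is an assumption-laden toy, not the paper's implementation: `decode_chunk` is a hypothetical stand-in for autoregressive generation of one chunk, and the chunk specs are assumed to have already been identified as semantically independent.

```python
# Hypothetical sketch: if chunks of a response depend only on the prompt
# (not on each other), they can be decoded concurrently and then
# reassembled in their original order.
from concurrent.futures import ThreadPoolExecutor

def decode_chunk(prompt, chunk_spec):
    # Stand-in for autoregressive decoding of one independent chunk.
    return f"[{chunk_spec} for: {prompt}]"

def parallel_decode(prompt, chunk_specs):
    # Completion order is free; map() preserves the original chunk order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda s: decode_chunk(prompt, s), chunk_specs))

print(parallel_decode("plan a trip", ["flights", "hotels", "activities"]))
```

The speedup comes from overlapping the chunks' decoding latency; the hard part, which the post attributes to learning, is deciding which chunks are actually independent.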