Houjun Liu
@jemoka.com
NLP & POMDPs; CS@Stanford; gradient descent enthusiast

www: jemoka.com
ac: nlp.stanford.edu/~houjun/
Better yet, without us teaching the model to do this at all, it learned to allocate more compute to tokens of higher entropy (even as measured by an independently trained model of the same architecture), and to use less compute where there's either too little or too much entropy. 🤯
October 2, 2025 at 3:54 PM
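(For the curious: one rough way to check a correlation like this, as a sketch; the reference model, its HF-style `.logits` output, and the fork counts below are placeholders I made up, not the paper's actual analysis code.)

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def entropy_compute_corr(ref_model, tokens, forks_per_token):
    """Hypothetical sketch: correlate per-token entropy under an
    independently trained reference LM with how much compute (forks)
    the model allocated to each token."""
    logits = ref_model(tokens).logits                          # (batch, seq, vocab)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(-1)   # (batch, seq)
    # Pearson correlation between entropy and allocated compute
    x = entropy.flatten() - entropy.mean()
    y = forks_per_token.flatten().float()
    y = y - y.mean()
    return (x * y).sum() / (x.norm() * y.norm() + 1e-9)
```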
Just by using our approach, you don't have to do any extra work to get pretraining gains! We show, across scales AND with compute matched, that our approach achieves better pretraining perplexity than both regular transformers and manually inserted non-adaptive thinking tokens. 🥳
October 2, 2025 at 3:54 PM
We design a transformer variant that uses a score-attenuated "forking" mechanism to clone useful residuals that the model wants to update and attend to, thus creating a 𝗯𝘂𝗯𝗯𝗹𝗲 of latent computation for those highly informative tokens.
October 2, 2025 at 3:54 PM
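(Very roughly, a forking step could look something like this in PyTorch; this is my own illustrative sketch, not the paper's implementation.)

```python
import torch
import torch.nn as nn

class ForkingSketch(nn.Module):
    """Hypothetical sketch: tokens with high fork scores get their
    residual stream cloned into extra latent positions, scaled by the
    score so the forking choice stays differentiable."""

    def __init__(self, d_model: int, n_forks: int):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)  # per-token fork score
        self.n_forks = n_forks

    def forward(self, resid: torch.Tensor):
        # resid: (batch, seq, d_model) residual stream
        scores = torch.sigmoid(self.scorer(resid)).squeeze(-1)  # (batch, seq)
        top = scores.topk(self.n_forks, dim=-1)                 # tokens worth forking
        idx = top.indices.unsqueeze(-1).expand(-1, -1, resid.size(-1))
        clones = torch.gather(resid, 1, idx)                    # (batch, n_forks, d_model)
        clones = clones * top.values.unsqueeze(-1)              # score-attenuated
        # the clones form a latent "bubble" later layers can attend to
        return torch.cat([resid, clones], dim=1)
```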
Introducing 𝘁𝗵𝗼𝘂𝗴𝗵𝘁𝗯𝘂𝗯𝗯𝗹𝗲𝘀: a *fully unsupervised* LM for input-adaptive parallel latent reasoning

✅ Learn yourself a reasoning model with normal pretraining
✅ Better perplexity compared to fixed thinking tokens

No fancy loss, no chain of thought labels 🚀
October 2, 2025 at 3:54 PM
Even across baseline methods, low-perplexity prompts make for more effective attacks; yet optimizing for attack success alone yields high-perplexity prompts.
August 20, 2025 at 7:51 PM
In fact, our method lets us discover a Pareto tradeoff (🤯) between attack success and prompt likelihood; tuning a single parameter in our method traces out the Pareto-optimal front.
August 20, 2025 at 7:51 PM
Using the Adaptive Stress Testing (AST) framework as a reward signal for online DPO-based optimization, we present a method that discovers prompts which are **both** high-probability and successful as attacks.
August 20, 2025 at 7:51 PM
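(Roughly, as a sketch of the idea, with `lam` as the single knob from the Pareto post above; the names and exact shaping here are mine, not the paper's reward.)

```python
def combined_reward(attack_success: float, avg_log_prob: float,
                    lam: float = 0.1) -> float:
    """Hypothetical reward shaping: favor attacks that succeed AND stay
    likely under the LM. Sweeping lam traces out the Pareto front
    between attack success and prompt likelihood."""
    return attack_success + lam * avg_log_prob  # avg_log_prob = -log(perplexity)

# online DPO then prefers the higher-reward prompt in each sampled pair
def to_preference(prompt_a, prompt_b, r_a, r_b):
    chosen, rejected = (prompt_a, prompt_b) if r_a >= r_b else (prompt_b, prompt_a)
    return {"chosen": chosen, "rejected": rejected}
```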
New Paper Day! For EMNLP Findings: in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity to get high-probability, hard-to-filter, natural attacks!
August 20, 2025 at 7:51 PM
Through a 🤏 pinch of interp, we show that pretraining with dropout degrades model-editing success.

Dropout builds dispersed representations => less consistent representation of the world => worse models.
June 2, 2025 at 1:23 AM
BERTs and other encoder models aren't spared either: turning on just 10% dropout degrades MLM and SQuAD performance.
June 2, 2025 at 1:23 AM
This stays true BOTH 1) at scale AND 2) with early dropout, which is supposed to be a way to stabilize convergence.
June 2, 2025 at 1:23 AM
We show that applying dropout in pretraining kneecaps the models, even with downstream finetuning.
June 2, 2025 at 1:23 AM
New Paper Day! For ACL 2025 Findings:

You should **drop dropout** when you are training your LMs AND MLMs!
June 2, 2025 at 1:22 AM
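(In practice, the takeaway is as simple as zeroing out every dropout knob before pretraining; illustrated below with Hugging Face's GPT2Config, though the paper's experiments aren't necessarily GPT-2.)

```python
from transformers import GPT2Config, GPT2LMHeadModel

# drop dropout: zero every dropout probability before pretraining
config = GPT2Config(
    resid_pdrop=0.0,  # residual-stream dropout off
    embd_pdrop=0.0,   # embedding dropout off
    attn_pdrop=0.0,   # attention dropout off
)
model = GPT2LMHeadModel(config)
```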
the world if I could spell "causal interventions" correctly on the first try
February 14, 2025 at 11:23 PM
I gotchu
February 12, 2025 at 5:53 AM
:fingerguns:
November 29, 2024 at 8:14 PM