Lightnews — Scholar-powered news

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

The Golden pacifiers are ready
See you soon at BabyLM (EMNLP)

November 1, 2025 at 3:43 AM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

They did it for images, video, and text, and it all compresses really, really well.

October 6, 2025 at 4:47 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

So, on average, we get short numbers that represent sentences. To decode them, we run the model again, get the probabilities, and use them to decide which word to feed the model next.

October 6, 2025 at 4:47 PM
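A minimal sketch of the encode/decode loop described in the post above, not the paper's implementation. `toy_model` is a hypothetical stand-in for an LLM (it just counts the symbols seen so far), the three-symbol alphabet and the `$` end marker are assumptions for illustration, and exact fractions replace the fixed-precision arithmetic a real coder would use. The point it shows: the decoder re-runs the same model on the text decoded so far, so both sides see identical probabilities.

```python
from fractions import Fraction
import math

END = "$"                 # assumed end-of-message marker
ALPHABET = "ab" + END     # assumed toy alphabet


def toy_model(prefix):
    """Hypothetical stand-in for an LLM: next-symbol probabilities given a prefix."""
    counts = {s: 1 for s in ALPHABET}   # add-one smoothing
    for ch in prefix:
        counts[ch] += 1
    total = sum(counts.values())
    return {s: Fraction(counts[s], total) for s in ALPHABET}


def symbol_range(probs, symbol):
    """Return (cumulative probability before symbol, probability of symbol)."""
    cum = Fraction(0)
    for s in ALPHABET:
        if s == symbol:
            return cum, probs[s]
        cum += probs[s]
    raise ValueError(f"unknown symbol: {symbol!r}")


def encode(message):
    """Narrow [low, low + width) once per symbol, then pick a short binary number inside it."""
    low, width = Fraction(0), Fraction(1)
    for i, ch in enumerate(message):
        start, p = symbol_range(toy_model(message[:i]), ch)
        low, width = low + width * start, width * p
    nbits = 0
    while True:
        nbits += 1
        k = math.ceil(low * 2 ** nbits)          # smallest k with k / 2**nbits >= low
        if Fraction(k, 2 ** nbits) < low + width:
            return k, nbits                       # the message is the number k / 2**nbits


def decode(k, nbits):
    """Re-run the model on the decoded prefix to find which symbol's range holds the number."""
    value = Fraction(k, 2 ** nbits)
    low, width = Fraction(0), Fraction(1)
    out = ""
    while not out.endswith(END):
        probs = toy_model(out)
        for s in ALPHABET:
            start, p = symbol_range(probs, s)
            if low + width * start <= value < low + width * (start + p):
                low, width = low + width * start, width * p
                out += s
                break
    return out


message = "abaab" + END
k, nbits = encode(message)
print(nbits, "bits;", "round trip ok:", decode(k, nbits) == message)
```

The better the model predicts the next symbol, the wider each interval stays and the fewer bits are needed, which is why a stronger model compresses better in this scheme.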

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

LLMs, VLMs, ... can compress data
3x over JPEG/PNG etc.
6x over zlib, gzip etc.
How?
We all know they provide a probability over data, which is all classical compression needs
(arithmetic coding, see below)
Understanding is compressing, but this time not by the weights themselves
🤖📈🧠
#AI #compress #data

October 6, 2025 at 4:47 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Thus, a "feature" is defined by the sparse activations we find.
And these shift quite rapidly at a certain point in training

September 26, 2025 at 3:27 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

How can we do it?
Crosscoders map activations into a sparse representation and decode it back into the activations (classic compress-decompress).
A single crosscoder is then trained on the activations of all pretraining checkpoints, creating a shared space

September 26, 2025 at 3:27 PM
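A rough sketch of the forward pass of a crosscoder shared across checkpoints, as the post above describes it. It assumes the common formulation (per-checkpoint encoder and decoder weights, one shared ReLU latent, reconstruction loss plus an L1 sparsity penalty). The papers' exact architecture and loss may differ, and every name, dimension, and the `l1_coeff` value here is hypothetical.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init_params(n_ckpt, act_dim, latent_dim, scale=0.1):
    """Random weights: one encoder and one decoder per checkpoint, one shared bias."""
    def rand(rows, cols):
        return [[random.gauss(0, scale) for _ in range(cols)] for _ in range(rows)]
    W_enc = [rand(latent_dim, act_dim) for _ in range(n_ckpt)]
    W_dec = [rand(act_dim, latent_dim) for _ in range(n_ckpt)]
    b_enc = [0.0] * latent_dim
    return W_enc, b_enc, W_dec


def crosscoder_forward(acts, W_enc, b_enc, W_dec):
    """acts[c] is the activation vector of checkpoint c for the same input."""
    pre = list(b_enc)
    for c, a in enumerate(acts):
        # Every checkpoint writes into the same latent space.
        pre = [p + q for p, q in zip(pre, matvec(W_enc[c], a))]
    z = [max(0.0, p) for p in pre]                        # shared sparse code
    recon = [matvec(W_dec[c], z) for c in range(len(acts))]
    return z, recon


def crosscoder_loss(acts, z, recon, l1_coeff=1e-3):
    """Squared reconstruction error over all checkpoints plus an L1 sparsity penalty."""
    mse = sum((r - a) ** 2
              for act, rec in zip(acts, recon)
              for a, r in zip(act, rec))
    return mse + l1_coeff * sum(abs(v) for v in z)


random.seed(0)
n_ckpt, act_dim, latent_dim = 3, 4, 8
W_enc, b_enc, W_dec = init_params(n_ckpt, act_dim, latent_dim)
acts = [[random.gauss(0, 1) for _ in range(act_dim)] for _ in range(n_ckpt)]
z, recon = crosscoder_forward(acts, W_enc, b_enc, W_dec)
print("active latents:", sum(v > 0 for v in z), "loss:", crosscoder_loss(acts, z, recon))
```

Because the latent `z` is shared, the per-checkpoint decoder weights for a given latent indicate at which checkpoints that feature is present, which is one way such a model can expose when a feature emerges.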

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Employing mechanistic interpretability to study how models learn, not just where they end up
2 papers find:
There are phase transitions where features emerge and stay throughout learning
🤖📈🧠
alphaxiv.org/pdf/2509.17196
@amuuueller.bsky.social @abosselut.bsky.social
alphaxiv.org/abs/2509.05291

September 26, 2025 at 3:27 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

We also hope that attentive readers recognize our section titles are organized as a step-by-step plan!

September 24, 2025 at 6:08 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

They found that it is really hard to predict what is helpful (I wonder if it is because helpfulness itself is quite noisy. How predictable is it in general, even with the best information?)
But also that plans, even bad ones, help LLMs' and humans' performance (but slow them down)

September 24, 2025 at 6:08 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

The authors tasked many people with solving complicated questions based on information from step-by-step plans, and checked which plan helps more, taking into account whether it helped strong solvers (with IRT).

arxiv.org/abs/2509.18632
@nbalepur.bsky.social

September 24, 2025 at 6:08 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Helpfulness is what we are after, and we test it by asking humans, or reward models, for preferences.
And they fail😆

They show that humans are bad at predicting what is helpful, and so are reward models (all close to chance).
Reward models don't even predict what helps LLMs
RL🤔
🤖📈🧠
#AI #LLM

September 24, 2025 at 6:08 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Good luck with the @iclr_conf writing
Know anyone who needs tips?
Want a graph checklist?
Know any good tips you wanna add?

The writing guide:
docs.google.com/document/d/1...

September 17, 2025 at 5:43 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

This is obviously not sustainable, and kills the internet (see other papers by @shaynelongpre.bsky.social and @stellaathena.bsky.social)
They also foresee that the amount of unpaid labour will continue to grow with the demand for data.
arxiv.org/pdf/2504.12427

September 12, 2025 at 2:20 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

The most expensive part of training is the data, not the compute
Nikhil Kandpal & Colin Raffel calculate a really low bar for how much it would cost to produce LLM training data, at a wage of $3.8/h
Still, it comes out several scales above the compute.
Luckily (?), companies don't pay for the data
🤖📈🧠

September 12, 2025 at 2:20 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

A dataset of ancient Chinese writings to study with LLMs, including 170K sentences for pretraining
With 10K words, each mapped to its modern word (when applicable)
There are so many fascinating questions out there
www.arxiv.org/abs/2508.15791

August 25, 2025 at 8:09 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

ChatGPT agrees with you ...

August 15, 2025 at 8:36 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

The probability is well correlated with more training, and so is the loss... This is just like perplexity, and doesn't test knowledge/bias/...
As support, the probability of the wrong answer is highly correlated with that of the right answer, so most of the signal comes from the sentence and its form, not from knowledge.

August 14, 2025 at 8:15 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

When we get further away from next token prediction, we get side effects that lower the correlation between training FLOPs and score.
For example, the wrong answers can be reranked among themselves, which can change whether the right answer is picked, and accuracy ignores how close a 49-51 confidence split is.

August 14, 2025 at 8:15 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Given a dataset of multiple-choice questions, we can compute
🔻(log)probability of the right answer
🔻Probability of the right answer normalized by the probability of the rest of the answers
🔻A metric such as accuracy or Brier
Each step gets us further from next token pred.

August 14, 2025 at 8:15 PM
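A small sketch of the three quantities listed in the post above, computed for one multiple-choice question. The function name, the returned keys, and the example numbers are all hypothetical, and the input is assumed to be one log-probability per answer option as scored by the model.

```python
import math


def mc_metrics(answer_logprobs, correct):
    """Compute the three steps for one question.

    answer_logprobs: log-probability the model assigns to each answer option.
    correct: index of the right answer.
    """
    # Step 1: raw log-probability of the right answer (closest to next token prediction).
    raw_logprob = answer_logprobs[correct]

    # Step 2: probability of the right answer normalized over all options.
    top = max(answer_logprobs)
    exps = [math.exp(lp - top) for lp in answer_logprobs]   # shifted for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    normalized_prob = probs[correct]

    # Step 3: metrics. Accuracy keeps only the argmax; Brier keeps the confidence.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    accuracy = 1.0 if predicted == correct else 0.0
    brier = sum((p - (1.0 if i == correct else 0.0)) ** 2
                for i, p in enumerate(probs))

    return {
        "raw_logprob": raw_logprob,
        "normalized_prob": normalized_prob,
        "accuracy": accuracy,
        "brier": brier,
    }


# Two nearly identical models: a tiny shift in confidence flips accuracy completely.
print(mc_metrics([math.log(0.49), math.log(0.51)], correct=0))
print(mc_metrics([math.log(0.51), math.log(0.49)], correct=0))
```

In the two example calls the raw and normalized probabilities barely move while accuracy jumps from 0 to 1, which illustrates how the later steps can add noise that training compute does not explain.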

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Why is it hard to predict downstream scores from pretraining?
For many reasons such as domain, mismatch between current abilities and what post training unfolds, "emergence" etc.
A big factor is that next token prediction != choice comparison != accuracy
www.alphaxiv.org/abs/2406.04391

August 14, 2025 at 8:15 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

The right path is unclear; maybe it doesn't even exist.
Still, LLMs appear to consistently follow the values of secular/rational people who strive for self-expression (sounds like me😅)

To show it, they collect and release
200K human-model chats+feedback, 5 languages and 21 LLMs
🤖📈🧠

August 11, 2025 at 11:55 AM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Pressing matters in evaluation from GEMs talk
Remember, exciting questions drive science, exciting answers follow.
Setting the right goal may make all their SOTA chasing worthwhile.
Make an insightful dataset, lead by evaluation
🤖📈🧠

August 5, 2025 at 11:40 AM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

Something like that?

July 24, 2025 at 3:00 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

ICML proposes we use sophisticated jailbreaks in our papers?
Ones that trick the reviewers but do not raise our scores?
Proposals?

July 24, 2025 at 3:00 PM

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

We don't understand loss spikes, that's clear.
We've learned recently that data deterministically makes spikes regardless of optimizer.
What did we see when we stopped pretraining, and then continued?
A huge spike and never a recovery? Why?
Apparently the momentum matters, a lot.
🤖📈🧠

July 22, 2025 at 11:58 AM
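A toy illustration of why optimizer state can matter when resuming, under strong assumptions: a one-dimensional quadratic loss and plain heavy-ball momentum, with hypothetical names and hyperparameters. It does not reproduce a loss spike and is not the experiment the post refers to. It only shows that resuming with the saved momentum buffer continues the original trajectory exactly, while resuming with a zeroed buffer does not.

```python
def sgd_momentum(w, v, steps, lr=0.1, beta=0.9):
    """Minimize f(w) = 0.5 * w**2 (gradient is w) with heavy-ball momentum."""
    for _ in range(steps):
        grad = w
        v = beta * v + grad      # momentum buffer accumulates past gradients
        w = w - lr * v
    return w, v


# Uninterrupted run: 10 steps.
w_full, _ = sgd_momentum(1.0, 0.0, steps=10)

# Stop after 5 steps, then continue for 5 more.
w_mid, v_mid = sgd_momentum(1.0, 0.0, steps=5)
w_saved, _ = sgd_momentum(w_mid, v_mid, steps=5)   # momentum restored from the checkpoint
w_reset, _ = sgd_momentum(w_mid, 0.0, steps=5)     # momentum lost on resume

print("uninterrupted:    ", w_full)
print("resume, saved v:  ", w_saved)
print("resume, zeroed v: ", w_reset)
```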
