Lightnews — Scholar-powered news

Emily Cheng

@emcheng.bsky.social

110 followers 240 following 6 posts

https://generalstrikeus.com/
PhD student in computational linguistics at UPF
chengemily1.github.io

Previously: MIT CSAIL, ENS Paris

Barcelona


Emily Cheng

@emcheng.bsky.social

arxiv.org/abs/2405.15471

with Diego Doimo, Corentin Kervadec, Iuri Macocco, Jade Yu, Alessandro Laio, and Marco Baroni.

6/6

Emergence of a High-Dimensional Abstraction Phase in Language Transformers

A language model (LM) is a mapping from a linguistic context to an output token. However, much remains to be known about this mapping, including how its geometric properties relate to its function. We...

arxiv.org

February 2, 2025 at 6:53 PM

Emily Cheng

@emcheng.bsky.social

3️⃣ LLMs that are better at next-token prediction have higher, earlier ID peaks.

5/6

February 2, 2025 at 6:53 PM

Emily Cheng

@emcheng.bsky.social

2️⃣ The ID peak (beige) is where different LLMs are most similar (big shapes).

All LLMs share this high-dimensional phase of linguistic abstraction, but...

4/6

February 2, 2025 at 6:53 PM

Emily Cheng

@emcheng.bsky.social

... the ID peak marks where syntactic, semantic, and abstract linguistic features like toxicity and sentiment are first decodable.

⭐ Use these layers for downstream transfer!

(e.g., for brain encoding models, see arxiv.org/abs/2409.05771)

3/6

February 2, 2025 at 6:53 PM

Emily Cheng

@emcheng.bsky.social

1️⃣ The ID peak is linguistically relevant.

- it collapses on shuffled text (destroying syntactic/semantic structure)
- it grows over the course of training...

2/6

February 2, 2025 at 6:53 PM
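
The intrinsic dimension (ID) that this thread refers to can be estimated from a cloud of points, such as one layer's hidden representations. Below is a minimal sketch, assuming the TwoNN estimator in its maximum-likelihood form. This is an illustrative choice and not necessarily the estimator used in the paper. The function name `twonn_id` and the synthetic data are hypothetical stand-ins for real LM activations.

```python
import numpy as np


def twonn_id(points):
    """Estimate intrinsic dimension of an (n, d) array with TwoNN (MLE form).

    Assumes no duplicate points (a zero nearest-neighbor distance would
    make the ratio undefined).
    """
    # Pairwise Euclidean distances; memory is O(n^2 * d), fine for small n.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself

    # Distances to the first and second nearest neighbors of every point.
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = nearest[:, 0], nearest[:, 1]

    # Under TwoNN, mu = r2 / r1 is Pareto-distributed with shape equal to
    # the intrinsic dimension, so the MLE is n / sum(log(mu)).
    mu = r2 / r1
    return len(mu) / np.sum(np.log(mu))


if __name__ == "__main__":
    # Synthetic stand-in for activations: a 2-D Gaussian cloud embedded
    # linearly in 6 ambient dimensions (assumption for illustration only).
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(800, 2))
    embedding = rng.normal(size=(2, 6))
    points = latent @ embedding
    print("ambient dimension:", points.shape[1])
    print("estimated ID:", twonn_id(points))
```

Applied layer by layer to real activations, an estimate like this would give an ID profile in which a peak could be located. The pairwise-distance step here is quadratic in the number of points, so larger samples would need a nearest-neighbor index instead.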
