Emily Cheng
emcheng.bsky.social
https://generalstrikeus.com/
PhD student in computational linguistics at UPF
chengemily1.github.io

Previously: MIT CSAIL, ENS Paris

Barcelona
arxiv.org/abs/2405.15471

with Diego Doimo, Corentin Kervadec, Iuri Macocco, Jade Yu, Alessandro Laio, and Marco Baroni.

6/6
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
A language model (LM) is a mapping from a linguistic context to an output token. However, much remains to be known about this mapping, including how its geometric properties relate to its function. We...
February 2, 2025 at 6:53 PM
3️⃣ LLMs that are better at next-token prediction have higher, earlier ID peaks.

5/6
February 2, 2025 at 6:53 PM
2️⃣ The ID peak (beige) is where different LLMs are most similar (big shapes).

All LLMs share this high-dimensional phase of linguistic abstraction, but...

4/6
February 2, 2025 at 6:53 PM
... the ID peak marks where syntactic, semantic, and abstract linguistic features like toxicity and sentiment are first decodable.

⭐ use these layers for downstream transfer!

(e.g., for brain encoding models, see arxiv.org/abs/2409.05771)

3/6
February 2, 2025 at 6:53 PM
1️⃣ The ID peak is linguistically relevant.

- it collapses on shuffled text (shuffling destroys syntactic/semantic structure)
- it grows over the course of training...

2/6
February 2, 2025 at 6:53 PM
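
For readers new to the term: ID in this thread stands for intrinsic dimension, roughly the number of directions a set of representations actually varies along, as opposed to the size of the hidden state. The thread does not say which estimator the paper uses, so the sketch below swaps in a simple TwoNN-style maximum-likelihood estimate for illustration only. The function name `twonn_id` and the toy data are hypothetical; in practice `points` would be one layer's hidden states, one row per input sequence.

```python
import numpy as np


def twonn_id(points: np.ndarray) -> float:
    """TwoNN-style maximum-likelihood estimate of intrinsic dimension.

    Illustrative stand-in only, not necessarily the paper's estimator.
    points: array of shape (n_samples, n_features).
    """
    # All pairwise Euclidean distances (fine for a few thousand points).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour

    # Distances to the first and second nearest neighbours of each point.
    nearest_two = np.partition(dists, 1, axis=1)[:, :2]
    r1, r2 = nearest_two[:, 0], nearest_two[:, 1]

    # Drop exact duplicates (r1 == 0), which would make the ratio undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]

    # Under the TwoNN model mu is Pareto-distributed with shape equal to
    # the intrinsic dimension, so the MLE is n / sum(log mu).
    return float(keep.sum() / np.sum(np.log(mu)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: a 2-D sheet linearly embedded in a 10-D ambient space.
    sheet = rng.uniform(size=(800, 2))
    embedded = sheet @ rng.normal(size=(2, 10))
    print("estimated ID:", twonn_id(embedded))
```

Running an estimator like this on every layer's hidden states gives an ID-versus-layer curve; the "ID peak" in the posts above would be the maximum of such a curve.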