Emily Cheng
emcheng.bsky.social
https://generalstrikeus.com/
PhD student in computational linguistics at UPF
chengemily1.github.io

Previously: MIT CSAIL, ENS Paris

Barcelona
3️⃣ LLMs that are better at next-token prediction have higher, earlier intrinsic dimension (ID) peaks.

5/6
February 2, 2025 at 6:53 PM
2️⃣ The ID peak (beige) is where different LLMs are most similar (big shapes).

All LLMs share this high-dimensional phase of linguistic abstraction, but...

4/6
February 2, 2025 at 6:53 PM
... the ID peak marks where syntactic, semantic, and abstract linguistic features like toxicity and sentiment are first decodable.

⭐ Use these layers for downstream transfer!

(e.g., for brain encoding models, see arxiv.org/abs/2409.05771)

3/6
February 2, 2025 at 6:53 PM
1️⃣ The ID peak is linguistically relevant.

- it collapses on shuffled text (shuffling destroys syntactic/semantic structure)
- it grows over the course of training...

2/6
February 2, 2025 at 6:53 PM
Here's our work accepted to #ICLR2025!

We look at how intrinsic dimension evolves over LLM layers, spotting a universal high-dimensional phase.

This ID peak is where:

- linguistic features are built
- different LLMs are most similar,

with implications for task transfer

🧵 1/6
February 2, 2025 at 6:53 PM
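
For readers who want a feel for what "intrinsic dimension over layers" means in practice, here is a minimal sketch. The thread does not say which ID estimator the paper uses, so this assumes a maximum-likelihood variant of the TwoNN estimator (ratio of second to first nearest-neighbor distances). The names `twonn_id`, `id_profile`, and `synthetic_layer` are hypothetical, and the "layers" are synthetic point clouds standing in for real LLM hidden states.

```python
import numpy as np


def twonn_id(points):
    """TwoNN-style maximum-likelihood estimate of intrinsic dimension.

    Assumption: this estimator is chosen for illustration only; it is not
    confirmed as the method behind the results in the thread.
    """
    x = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)        # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)    # exclude each point's distance to itself
    # Two smallest distances per row: first and second nearest neighbors.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0                   # drop exact duplicates
    mu = r2[keep] / r1[keep]
    # Under locally uniform density, mu is Pareto-distributed with exponent d,
    # so the maximum-likelihood estimate is N / sum(log mu).
    return keep.sum() / np.sum(np.log(mu))


def id_profile(layers):
    """Estimate ID for each layer's (n_samples, hidden_size) activation matrix."""
    return np.array([twonn_id(h) for h in layers])


def synthetic_layer(rng, n_samples, latent_dim, hidden_size):
    """Hypothetical stand-in for one layer: a latent_dim-dimensional Gaussian
    cloud embedded isometrically in hidden_size dimensions."""
    z = rng.normal(size=(n_samples, latent_dim))
    # Orthonormal columns, so the embedding preserves distances.
    q, _ = np.linalg.qr(rng.normal(size=(hidden_size, latent_dim)))
    return z @ q.T
```

With real models you would replace `synthetic_layer` with per-layer hidden states collected over a text sample (one row per input), call `id_profile` on the list, and look at where `np.argmax` of the result falls. That layer index is the candidate "ID peak" in the sense of the thread.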