Deniz Bayazit
@bayazitdeniz.bsky.social
#NLProc PhD student @EPFL

#interpretability
6/ Concurrently, recent work uses sparse crosscoders to show broad phases of concept evolution (statistical → feature learning); with RelIE we track the causal dynamics of specific concepts over time and across languages, giving a fuller, finer-grained view.

arxiv.org/abs/2509.17196
Evolution of Concepts in Language Model Pre-Training
September 25, 2025 at 2:02 PM
5/ Looking closer, feature sharing has limits: in Hindi & Arabic, overlap stays low even at 341B tokens. This may be due to richer agreement systems (e.g., verbs agreeing w/ both subjects & objects) forcing BLOOM to keep language-specific features, or simply to data scarcity!
4/ In #multilingual models, cross-language feature overlap starts low and rises with training. At 6B tokens in BLOOM, most detectors are language-specific or fire on punctuation; by 341B tokens, shared cross-lingual features emerge, capturing syntactic abstractions over token patterns.
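To make the overlap measurement concrete, here is a minimal sketch of one way it could be computed (illustrative only: the detector threshold, the Jaccard metric, and the toy data are my assumptions, not the paper's exact procedure):

import numpy as np

# Hypothetical sketch: call a crosscoder feature a "detector" for a language
# if it fires on more than `threshold` of that language's tokens, then
# measure pairwise overlap of detector sets with Jaccard similarity.
def active_features(act_rates: np.ndarray, threshold: float = 0.05) -> set:
    return set(np.flatnonzero(act_rates > threshold))

def jaccard_overlap(rates_a: np.ndarray, rates_b: np.ndarray) -> float:
    a, b = active_features(rates_a), active_features(rates_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy data: activation rates of 1,000 features for four languages at one checkpoint.
rng = np.random.default_rng(0)
rates = {lang: rng.beta(0.5, 8.0, size=1000) for lang in ["en", "fr", "hi", "ar"]}
for la, lb in [("en", "fr"), ("hi", "ar")]:
    print(la, lb, round(jaccard_overlap(rates[la], rates[lb]), 3))

Repeating this per checkpoint would trace the low-then-rising overlap curves described above.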
3/ Which features matter early but fade, and which gain importance later? In Pythia, token-level detectors drop out, while higher-level grammatical features—like plural-noun detectors and nouns formed from verbs (e.g., runner from run)—strengthen by 286B tokens.
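A hedged sketch of how "matters early vs. gains importance later" could be operationalized, assuming per-feature importance scores (e.g., ablation effects) at each aligned checkpoint; the early/late split and the ratio test are illustrative choices, not the paper's method:

import numpy as np

def classify_trajectories(importance: np.ndarray, ratio: float = 2.0):
    """importance: (n_features, n_checkpoints) nonnegative scores.

    Flags features whose influence decays over training (e.g., token-level
    detectors) vs. grows (e.g., plural-noun detectors) by comparing mean
    importance over the first vs. second half of checkpoints.
    """
    half = importance.shape[1] // 2
    early = importance[:, :half].mean(axis=1) + 1e-9
    late = importance[:, half:].mean(axis=1) + 1e-9
    fading = np.flatnonzero(early / late > ratio)
    strengthening = np.flatnonzero(late / early > ratio)
    return fading, strengthening

imp = np.array([[0.50, 0.40, 0.10, 0.05],   # fades over training
                [0.01, 0.02, 0.20, 0.30]])  # strengthens over training
print(classify_trajectories(imp))  # -> (array([0]), array([1]))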
2/ We align critical checkpoints for a task with sparse crosscoders, measure each feature’s causal role, and introduce RelIE to compare their influence across checkpoints. This lets us trace how internal features shift—and when they matter—in models like Pythia, OLMo, and BLOOM.
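For intuition, a minimal sketch of a RelIE-style score, under my assumption that RelIE expresses a feature's indirect effect (IE) at one checkpoint relative to its IE across all aligned checkpoints; the paper's exact definition may differ:

import numpy as np

def relie(ie_per_checkpoint: np.ndarray) -> np.ndarray:
    """Share of a feature's total (absolute) indirect effect at each checkpoint.

    ie_per_checkpoint: 1-D array with the IE of ablating the same crosscoder
    feature (aligned across checkpoints) on a task metric.
    """
    mag = np.abs(ie_per_checkpoint)
    total = mag.sum()
    return mag / total if total > 0 else np.zeros_like(mag)

# Toy IEs for one feature at four checkpoints: its causal influence rises late.
print(relie(np.array([0.02, 0.05, 0.20, 0.40])))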