Yoav Gur Arieh
@yoav.ml
Two weeks ago I posted about our recent paper, which shows that to bind entities, LMs use three mechanisms: positional, lexical and reflexive.

We were curious how these mechanisms develop over the course of training, so we tested for each of them across OLMo checkpoints (a sketch of such a sweep is below) 👇
October 21, 2025 at 7:40 PM
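For readers who want to poke at this themselves: intermediate OLMo checkpoints are published as git revisions of the model repo on the HuggingFace Hub, so a sweep of this kind can be set up roughly as follows. This is a minimal sketch, not the paper's harness; the model id and revision tags are illustrative assumptions (check the repo's branch list for the real step tags).

```python
# Minimal sketch: sweep binding-mechanism evals over OLMo training
# checkpoints. The model id and revision names are assumptions;
# consult the HuggingFace repo for the actual published step tags.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/OLMo-2-1124-7B"  # illustrative OLMo variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Intermediate checkpoints live on separate branches of the model repo.
for revision in ["stage1-step10000-tokens42B", "main"]:  # hypothetical tags
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=revision)
    model.eval()
    # ... run positional / lexical / reflexive probes on `model` here ...
```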
To compensate for this, LMs use two additional mechanisms.

The first is *lexical*: the LM retrieves the subject that appears next to "Michael" by copying the lexical content of "Holly" onto "Michael", binding the two together (toy illustration below). 3/
October 8, 2025 at 2:56 PM
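To make the copying idea concrete, here is a toy activation-patching sketch. It is a hedged illustration, not the paper's intervention: GPT-2, the layer choice, and the prompt are all stand-in assumptions. We overwrite the residual stream at the queried name's position with the hidden state from another name's position and watch whether the prediction follows the copied lexical content.

```python
# Toy illustration of lexical copying via activation patching on GPT-2.
# Hedged sketch, NOT the paper's method: model, layer, and prompt are
# illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

LAYER = 6  # arbitrary middle layer
prompt = "Holly has the apple. Michael has the book. Michael has the"

enc = tok(prompt, return_tensors="pt", return_offsets_mapping=True)
offsets = enc.pop("offset_mapping")[0].tolist()

def tok_pos(char_idx):
    """Index of the token covering character position char_idx
    (patches only the first sub-token if a name splits)."""
    return next(i for i, (a, b) in enumerate(offsets) if a <= char_idx < b)

src_pos = tok_pos(prompt.find("Holly"))     # first "Holly"
dst_pos = tok_pos(prompt.rfind("Michael"))  # final, queried "Michael"

def copy_hook(module, inputs, output):
    # Overwrite the queried name's residual stream with Holly's.
    h = output[0]
    h[0, dst_pos] = h[0, src_pos]
    return output

def top_token(logits):
    return tok.decode(logits[0, -1].argmax())

with torch.no_grad():
    clean = model(**enc).logits
handle = model.transformer.h[LAYER].register_forward_hook(copy_hook)
with torch.no_grad():
    patched = model(**enc).logits
handle.remove()

# If lexical content drives retrieval, the patched run may now answer
# with Holly's object ("apple") instead of Michael's ("book").
print("clean:", top_token(clean), "| patched:", top_token(patched))
```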
Prior work identified only a positional mechanism, where the model tracks entities by position: here, retrieving the subject of the first clause, "Holly".

We show this isn't sufficient: the positional signal is strong at the edges of the context but weak and diffuse in the middle (see the probe sketch below). 2/
October 8, 2025 at 2:56 PM
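A quick way to see the positional signal fade, in the spirit of (but much simpler than) the paper's evaluations: build a context with many name-object pairs, query each position, and check greedy decoding. The model and prompt template here are stand-ins, not the paper's setup.

```python
# Hedged sketch of probing binding accuracy by position: GPT-2 and the
# prompt template are illustrative stand-ins, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

names = ["Holly", "Michael", "Anna", "David", "Sara", "Tom"]
objects = ["apple", "book", "coin", "drum", "egg", "fork"]
context = " ".join(f"{n} has the {o}." for n, o in zip(names, objects))

def retrieves(position: int) -> bool:
    """Does greedy decoding recover the object bound to the name
    at `position` in the context?"""
    prompt = f"{context} {names[position]} has the"
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        next_id = model(ids).logits[0, -1].argmax().item()
    return tok.decode(next_id).strip() == objects[position]

# Expect the edge pairs (first/last) to be retrieved more reliably
# than the middle ones if the positional pointer dominates.
for pos in range(len(names)):
    print(f"position {pos}: {'ok' if retrieves(pos) else 'miss'}")
```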
🧠 To reason over text and track entities, we find that language models use three types of 'pointers'!

Models were thought to rely only on a positional pointer, but when many entities appear, that system breaks down.

Our new paper shows what these pointers are and how they interact 👇
October 8, 2025 at 2:56 PM
This is a step toward targeted, interpretable, and robust knowledge removal at the parameter level.

Joint work with Clara Suslik, Yihuai Hong, and @fbarez.bsky.social, advised by @megamor2.bsky.social
🔗 Paper: arxiv.org/abs/2505.22586
🔗 Code: github.com/yoavgur/PISCES
May 29, 2025 at 4:22 PM