Lightnews — Scholar-powered news

Kayo Yin

@kayoyin.bsky.social

1.4K followers 180 following 37 posts

PhD student at UC Berkeley. NLP for signed languages and LLM interpretability. kayoyin.github.io
🏂🎹🚵‍♀️🥋


Kayo Yin

@kayoyin.bsky.social

We speculate that induction heads help models learn the more complex function vector (FV) mechanism, which ultimately drives in-context learning 🤔

Paper: arxiv.org/abs/2502.14010

Which Attention Heads Matter for In-Context Learning?

Large language models (LLMs) exhibit impressive in-context learning (ICL) capability, enabling them to perform new tasks using only a few demonstrations in the prompt. Two different mechanisms have be...

February 28, 2025 at 4:16 PM

Kayo Yin

@kayoyin.bsky.social

How to reconcile this with previous studies on ICL?

The key differences are that previous works:
- measure ICL using differences between token losses, which we find behaves differently from few-shot ICL accuracy
- don't control for overlap between induction and FV
- focus on small models

February 28, 2025 at 4:16 PM
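
To make the contrast between the two ICL measurements in the post above concrete, here is a minimal sketch. It is not the paper's evaluation code: the function names, the default positions (`early=49`, `late=499`), and the toy inputs are all assumptions chosen for illustration.

```python
def token_loss_difference(per_token_losses, early=49, late=499):
    """Loss at a late context position minus loss at an early one.

    A more negative value means the model benefits more from a longer
    context. The default positions are illustrative assumptions.
    """
    return per_token_losses[late] - per_token_losses[early]


def few_shot_accuracy(predictions, targets):
    """Fraction of few-shot queries answered exactly right."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical per-token losses that fall steadily with position.
losses = [4.0 - 0.004 * i for i in range(500)]
loss_gap = token_loss_difference(losses)

# Hypothetical few-shot predictions against gold labels.
accuracy = few_shot_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```

The first metric only needs losses on ordinary text, while the second needs a labeled few-shot task, which is one reason the two can behave differently.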

Kayo Yin

@kayoyin.bsky.social

Other interesting findings:

- FV heads have relatively high induction scores and vice versa compared to other heads
- FV heads emerge later in training than induction heads
- ICL accuracy rises around the same time induction emerges during training, but increases more gradually

February 28, 2025 at 4:16 PM

Kayo Yin

@kayoyin.bsky.social

We also find evidence of induction heads that evolve into FV heads.

Several instances of FV heads have a high induction score earlier in training (around when induction heads first emerge). However, the reverse (induction heads with high FV scores earlier) does not occur.

February 28, 2025 at 4:16 PM

Kayo Yin

@kayoyin.bsky.social

Two mechanisms have been proposed to explain ICL: induction heads that find and copy relevant tokens, and FV heads that compute a latent encoding of the task from examples.

Our ablations show that FV heads are crucial for few-shot ICL, whereas induction heads are not necessary.

February 28, 2025 at 4:16 PM
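
As a rough illustration of the "find and copy" behavior attributed to induction heads in the post above, the sketch below applies the same rule to a list of symbols. It is a toy analogy under stated assumptions, not an attention head and not code from the paper; the function name `induction_predict` is hypothetical.

```python
def induction_predict(tokens):
    """Toy induction rule over a token list.

    Find the most recent earlier occurrence of the final token and
    return the token that followed it. Return None if the final token
    has not appeared before.
    """
    if not tokens:
        return None
    last = tokens[-1]
    # Scan backwards over earlier positions, skipping the final token itself.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]
    return None


# After seeing "A B C", a second "A" is completed with "B".
prediction = induction_predict(["A", "B", "C", "A"])
```

An FV head, by contrast, is described as encoding the task itself rather than copying a token, so a lookup rule like this one does not capture it.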

Kayo Yin

@kayoyin.bsky.social

Thanks for the kind words, Seth 😊 glad you joined the dinner!

December 17, 2024 at 1:56 AM

Kayo Yin

@kayoyin.bsky.social

sad I’m not in town for this, looks super exciting!! 🍿

December 4, 2024 at 10:07 PM

Kayo Yin

@kayoyin.bsky.social

oof yeah I was afraid something like that was maybe going on. I hope she gets the help she needs…

December 4, 2024 at 9:26 PM

Kayo Yin

@kayoyin.bsky.social

ahh yes this is it thank you!! I hallucinated the end haha

November 26, 2024 at 11:20 PM

Kayo Yin

@kayoyin.bsky.social

glad to know at least I didn’t just make this up 😭 I think I heard it recently too but can’t remember at alll

November 26, 2024 at 12:05 PM

Kayo Yin

@kayoyin.bsky.social

Overall, handshapes in native ASL signs reflect communicative efficiency, but *not in signs borrowed from English*!

Check out our paper+code (w/ Terry Regier & Dan Klein) for more details and why we think that's the case: aclanthology.org/2024.acl-lon...

See you at TISLR in Ethiopia! ☀️ 8/8

November 21, 2024 at 5:40 AM

Kayo Yin

@kayoyin.bsky.social

What about perceptual effort - could it be correlated with English usage?

Perceptual effort to distinguish between 2 handshapes is very weakly correlated with how often the 2 letters appear in similar contexts in English, and in the "wrong" direction for efficiency. 7/8

[Image: scatter plot titled "Fingerspelling". X-axis: English letter confusability. Y-axis: handshape similarity. Points are labeled with letter pairs; the fitted line is labeled (r=0.19, p=0.00).]

November 21, 2024 at 5:40 AM
