Millicent Li
millicentli.bsky.social
CS PhD Student @ Northeastern, former undergrad @ UW, UWNLP
https://millicentli.github.io/
Reposted by Millicent Li
What's the right unit of analysis for understanding LLM internals? We explore this in our mech interp survey (a major update from our 2024 manuscript).

We’ve added more recent work and more immediately actionable directions for future work. Now published in Computational Linguistics!
October 1, 2025 at 2:03 PM
Wouldn’t it be great to have questions about LM internals answered in plain English? That’s the promise of verbalization interpretability. Unfortunately, our new paper shows that evaluating these methods is nuanced—and verbalizers might not tell us what we hope they do. 🧵👇1/8
September 17, 2025 at 7:19 PM