Andrew Lee
@ajyl.bsky.social
Post-doc @ Harvard. PhD UMich. Spent time at FAIR and MSR. ML/NLP/Interpretability
Question for @neuripsconf.bsky.social:
A coauthor had his reviews re-assigned many weeks ago. The ACs of those papers told him: "I've been told to tell you: leave a short note. You won't be penalized." Now I'm being warned of a desk-reject due to his short/poor reviews. What's the right protocol here?
July 4, 2025 at 8:56 PM
Reposted by Andrew Lee
How do language models track the mental states of each character in a story, a capability often referred to as Theory of Mind?

We reverse-engineered how LLaMA-3-70B-Instruct handles a belief-tracking task and found something surprising: it uses mechanisms strikingly similar to pointer variables in C programming!
June 24, 2025 at 5:13 PM
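A toy sketch of the pointer analogy from the post above: bind each character to a slot "address" and dereference it to read their belief, much like `state = *ptr` in C. The event format and all names below are invented for illustration; this is not the paper's actual mechanism.

```python
# Toy illustration of belief tracking via "pointers" (Sally-Anne style task).
story_events = [
    ("Sally", "puts", "marble", "basket"),
    ("Sally", "leaves", None, None),
    ("Anne", "moves", "marble", "box"),
]

memory = {}     # slot_id -> {object: location}, the beliefs stored at that slot
pointers = {}   # character -> slot_id, a "pointer" to that character's beliefs
present = set()
next_slot = 0

for who, action, obj, loc in story_events:
    if who not in pointers:          # first mention: bind character to a fresh slot
        pointers[who] = next_slot
        memory[next_slot] = {}
        next_slot += 1
    if action == "leaves":
        present.discard(who)
    else:
        present.add(who)
        # only characters who are present witness the event and update beliefs
        for witness in present:
            memory[pointers[witness]][obj] = loc

def believes(character, obj):
    """Dereference the character's pointer to read off their belief."""
    return memory[pointers[character]].get(obj)

print(believes("Sally", "marble"))   # -> "basket" (Sally holds a false belief)
print(believes("Anne", "marble"))    # -> "box"
```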
Reposted by Andrew Lee
🚨New #ACL2025 paper!

Today’s “safe” language models can look unbiased, but alignment can actually make them more implicitly biased by reducing their sensitivity to race-related associations.

🧵Find out more below!
June 10, 2025 at 2:39 PM
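As a rough illustration of how implicit associations are commonly probed (a WEAT-style association score over embeddings; whether this matches the paper's method is an assumption, and the word lists and vectors below are random stand-ins, not its data):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Stand-in embeddings; in practice these would come from the LM under test.
embed = lambda words: {w: rng.standard_normal(dim) for w in words}

group_a = embed(["name_a1", "name_a2", "name_a3"])   # hypothetical group-A terms
group_b = embed(["name_b1", "name_b2", "name_b3"])   # hypothetical group-B terms
pleasant = embed(["joy", "peace", "love"])
unpleasant = embed(["agony", "terrible", "awful"])

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, attr_x, attr_y):
    # Mean cosine similarity to attribute set X minus attribute set Y.
    return (np.mean([cos(w, x) for x in attr_x.values()])
            - np.mean([cos(w, y) for y in attr_y.values()]))

# WEAT-style test statistic: differential association of the two groups.
effect = (np.mean([association(w, pleasant, unpleasant) for w in group_a.values()])
          - np.mean([association(w, pleasant, unpleasant) for w in group_b.values()]))
print(f"differential association: {effect:+.4f}")
```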
🚨New preprint!

How do reasoning models verify their own chain-of-thought (CoT)?
We reverse-engineer LMs and find critical components and subspaces needed for self-verification!

1/n
May 13, 2025 at 6:52 PM
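A generic sketch of the kind of subspace test used in this line of work: project hidden states off a candidate subspace and check whether the behavior (here, self-verification) degrades. The dimensions and the subspace below are made up; the paper's actual components would come from its analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, k = 512, 4   # hidden size; rank of candidate "verification" subspace

# Candidate subspace, e.g. found by probing (random orthonormal stand-in here).
V = np.linalg.qr(rng.standard_normal((d_model, k)))[0]   # shape (d_model, k)

def ablate(h, V):
    """Remove the component of hidden state h lying in span(V)."""
    return h - V @ (V.T @ h)

h = rng.standard_normal(d_model)   # a hidden state at some layer/position
h_ablated = ablate(h, V)

print(float(np.abs(V.T @ h_ablated).max()))   # ~0: nothing left in the subspace
print(float(np.linalg.norm(V @ (V.T @ h))))   # size of the removed component
```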
🚨New Preprint! Did you know that steering vectors from one LM can be transferred and re-used in another LM? We argue this is because token embeddings across LMs share many “global” and “local” geometric similarities!
May 7, 2025 at 1:38 PM
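One plausible way to exploit such shared geometry, sketched under the assumption that the two models share a vocabulary (the paper's exact transfer procedure may differ): align the two embedding spaces with orthogonal Procrustes, then push the steering vector through the alignment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_shared, d1, d2 = 1000, 64, 64   # shared tokens; embedding dims (toy sizes)

# Stand-ins for the two LMs' embeddings of the shared vocabulary: here LM 2
# is a hidden rotation of LM 1 plus noise, mimicking shared geometry.
E1 = rng.standard_normal((n_shared, d1))
W_true = np.linalg.qr(rng.standard_normal((d1, d2)))[0]
E2 = E1 @ W_true + 0.01 * rng.standard_normal((n_shared, d2))

# Orthogonal Procrustes: W = argmin over orthogonal W of ||E1 W - E2||_F.
U, _, Vt = np.linalg.svd(E1.T @ E2)
W = U @ Vt

steer_lm1 = rng.standard_normal(d1)   # a steering vector found in LM 1
steer_lm2 = steer_lm1 @ W             # its transferred counterpart for LM 2

print(np.linalg.norm(E1 @ W - E2) / np.linalg.norm(E2))  # alignment residual
print(np.linalg.norm(steer_lm2 - steer_lm1 @ W_true))    # close to ground truth
```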
Reposted by Andrew Lee
Today we launch a new open research community

It is called ARBOR:
arborproject.github.io/

Please join us.
bsky.app/profile/ajy...
February 20, 2025 at 10:15 PM
Excited about recent reasoning models? What is happening under the hood?
Join ARBOR: Analysis of Reasoning Behaviors through *Open Research*, a radically open collaboration to reverse-engineer reasoning models!
Learn more: arborproject.github.io
1/N
February 20, 2025 at 7:55 PM
New paper <3
Interested in inference-time scaling? In-context learning? Mech interp?
LMs can solve novel in-context tasks given sufficient examples (longer contexts). Why? Because they dynamically form *in-context representations*!
1/N
January 5, 2025 at 3:49 PM
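A hypothetical probe for such in-context representations: collect hidden states for a set of concept tokens at several context lengths and check whether low-dimensional structure sharpens as context grows. The LM call is faked below (structure injected by hand); with a real model it would be replaced by actual hidden-state extraction.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 256
context_lengths = [8, 64, 512]

def hidden_states(n_examples, n_concepts=12):
    """Stand-in for the LM: hidden states for concept tokens after seeing
    n_examples in-context examples. A ring structure over concepts is planted
    and gets cleaner with longer contexts, mimicking the claimed emergence."""
    angles = 2 * np.pi * np.arange(n_concepts) / n_concepts
    ring = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    basis = np.linalg.qr(rng.standard_normal((d_model, 2)))[0]
    noise = rng.standard_normal((n_concepts, d_model)) / np.sqrt(n_examples)
    return ring @ basis.T + noise

def top2_variance_ratio(H):
    """Fraction of variance captured by the top-2 PCA directions."""
    Hc = H - H.mean(axis=0)
    s = np.linalg.svd(Hc, compute_uv=False)
    return (s[:2] ** 2).sum() / (s ** 2).sum()

for n in context_lengths:
    H = hidden_states(n)
    print(f"{n:4d} in-context examples: top-2 PC variance = "
          f"{top2_variance_ratio(H):.2f}")
```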