Yucheng Sun
yuchengsun.bsky.social
Currently at ETH Zurich. Working on mechanistic interpretability.
1/6: Can we use an LLM’s hidden activations to predict and prevent wrong predictions? When it comes to arithmetic, yes!
I’m presenting new work w/ @alestolfo.bsky.social: “Probing for Arithmetic Errors in LMs” @ #ICML2025 Act Interp WS
🧵 below
July 18, 2025 at 5:22 PM