lovisheindrich.bsky.social
@lovisheindrich.bsky.social
Reposted
New paper alert! 🚨

Important question: Do SAEs generalise?
We explore answerability detection in LLMs by comparing SAE features with linear residual stream probes.

Answer:
Probes outperform SAE features in-domain, while out-of-domain generalization varies sharply across features and datasets. 🧵
March 1, 2025 at 6:14 PM