Theory and interpretability of LLMs.
https://www.mhahn.info
🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440
🧵[1/n]
🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440
🧵[1/n]
🔗: arxiv.org/abs/2504.05058
🔗: arxiv.org/abs/2504.05058
arxiv.org/abs/2412.20292
Our closed-form theory needs no training, is mechanistically interpretable & accurately predicts diffusion model outputs with high median r^2~0.9
arxiv.org/abs/2412.20292
Our closed-form theory needs no training, is mechanistically interpretable & accurately predicts diffusion model outputs with high median r^2~0.9