Michael Hahn
@m-hahn.bsky.social
Prof at Saarland. NLP and machine learning.
Theory and interpretability of LLMs.
https://www.mhahn.info
Reposted by Michael Hahn
LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing?

🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440

🧵[1/n]
June 20, 2025 at 10:03 PM
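A minimal sketch of the kind of probe the AbsenceBench post above describes: remove a few lines from a document and ask the model to name exactly what is gone. The function name, prompt wording, and removal scheme below are illustrative assumptions, not the benchmark's actual protocol (see the linked paper for that).

```python
import random

def make_absence_probe(document_lines, n_remove=3, seed=0):
    """Illustrative AbsenceBench-style probe (hypothetical helper, not the
    authors' exact protocol): delete a few lines from a document and ask the
    model to list what is missing."""
    rng = random.Random(seed)
    removed = sorted(rng.sample(range(len(document_lines)), n_remove))
    kept = [line for i, line in enumerate(document_lines) if i not in removed]
    prompt = (
        "Original document:\n" + "\n".join(document_lines)
        + "\n\nModified copy with some lines removed:\n" + "\n".join(kept)
        + "\n\nList exactly the lines that are missing from the modified copy."
    )
    gold = [document_lines[i] for i in removed]  # ground-truth "negative space"
    return prompt, gold
```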
Reposted by Michael Hahn
New preprint alert! We often prompt ICL tasks using either demonstrations or instructions. How much does the form of the prompt matter to the task representation formed by a language model? Stick around to find out 1/N
May 23, 2025 at 5:38 PM
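To make the contrast in the post above concrete, here are the two prompt forms for one toy in-context task (an antonym mapping chosen purely for illustration; these are not the paper's prompts or tasks):

```python
# (a) Demonstration-based prompt: input-output pairs, the rule is left implicit.
demo_prompt = "hot -> cold\ntall -> short\nfast ->"

# (b) Instruction-based prompt: the rule is stated explicitly, then the query follows.
instr_prompt = (
    "Output the antonym of the given word.\n"
    "Input: fast\n"
    "Output:"
)

# The preprint asks whether a model forms the same internal task representation
# when prompted either way; the strings above only illustrate the two surface forms.
```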
Chain-of-Thought (CoT) reasoning lets LLMs solve complex tasks, but long CoTs are expensive. How short can they be while still working? Our new ICML paper tackles this foundational question.
May 5, 2025 at 12:25 PM
Reposted by Michael Hahn
Check out our new paper on unlearning for LLMs 🤖. We show that *not all data are unlearned equally* and argue that future work on LLM unlearning should take properties of the data to be unlearned into account. This work was led by my intern @a-krishnan.bsky.social
🔗: arxiv.org/abs/2504.05058
April 9, 2025 at 1:31 PM
Reposted by Michael Hahn
Our new paper! "Analytic theory of creativity in convolutional diffusion models" led expertly by @masonkamb.bsky.social
arxiv.org/abs/2412.20292
Our closed-form theory needs no training, is mechanistically interpretable & accurately predicts diffusion model outputs with high median r^2 ≈ 0.9
December 31, 2024 at 4:54 PM
Reposted by Michael Hahn
Just read this, neat paper! I really enjoyed Figure 3 illustrating the basic idea: Suppose you train a diffusion model where the denoiser is restricted to be "local" (each pixel i only depends on its 3x3 neighborhood N(i)). The optimal local denoiser for pixel i is E[ x_0[i] | x_t[ N(i) ] ]...cont
January 1, 2025 at 2:46 AM
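As a worked companion to the explanation above: under a standard Gaussian forward process x_t = sqrt(alpha_t) x_0 + sqrt(1 - alpha_t) eps and an empirical prior over training images, the optimal local denoiser E[x_0[i] | x_t[N(i)]] is a patch-weighted average of training pixels. The NumPy sketch below assumes exactly that setup; it illustrates the posterior-mean formula, not the paper's precise construction.

```python
import numpy as np

def optimal_local_denoiser(x_t, train_images, i, j, alpha_t, k=1):
    """E[x_0[i,j] | x_t[N(i,j)]] under the empirical training prior, where N(i,j)
    is the (2k+1)x(2k+1) neighborhood of pixel (i,j) and the forward process is
    x_t = sqrt(alpha_t)*x_0 + sqrt(1-alpha_t)*eps. Illustrative sketch only."""
    noisy_patch = x_t[i - k:i + k + 1, j - k:j + k + 1].ravel()
    log_w, centers = [], []
    for x0 in train_images:
        clean_patch = x0[i - k:i + k + 1, j - k:j + k + 1].ravel()
        # Gaussian log-likelihood of the noisy patch given this training image
        diff = noisy_patch - np.sqrt(alpha_t) * clean_patch
        log_w.append(-np.sum(diff ** 2) / (2 * (1 - alpha_t)))
        centers.append(x0[i, j])
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Posterior mean of the clean center pixel, conditioned only on the local window
    return float(np.dot(w, np.array(centers)))
```

The key restriction is the conditioning set: each pixel is denoised as if only its 3x3 window existed, which is what makes the local denoiser analytically tractable.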