Maxime Peyrard
peyrardmax.bsky.social
Junior Professor CNRS (previously EPFL, TU Darmstadt) -- AI Interpretability, causal machine learning, and NLP. Currently visiting @NYU

https://peyrardm.github.io
We find a lot of identifiability issues:
- Multiple explanatory algorithms exist
- Even for one algorithm, there are many localizations in the network

Identifiability problems persist across scenarios: varying levels of over-parametrization, stages of training, multi-task settings, and model size.
April 21, 2025 at 1:52 PM
Mechanistic Interpretability aims to produce statements like: "Model M solves task T by doing X."
To do so, many causal manipulations are performed to validate an explanation. But what if (many) other, incompatible explanations also pass the causal tests?
April 21, 2025 at 1:52 PM
Our paper "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" will be presented at #ICLR2025!
It's also the first paper of my first PhD student, congrats @maximemeloux.bsky.social! 🎉

blog: melouxm.github.io/MI-identifia...

An explanatory thread 🧵:
April 21, 2025 at 1:52 PM