Lightnews — Scholar-powered news

Maxime Peyrard

@peyrardmax.bsky.social

570 followers 170 following 8 posts

Junior Professor CNRS (previously EPFL, TU Darmstadt) -- AI Interpretability, causal machine learning, and NLP. Currently visiting @NYU

https://peyrardm.github.io


Maxime Peyrard

@peyrardmax.bsky.social

We find a lot of identifiability issues:
- Multiple explanatory algorithms exist
- Even for one algorithm, there are many localizations in the network

Identifiability problems persist across scenarios: varying levels of over-parametrization, stages of training, multi-task settings, and model size.

April 21, 2025 at 1:52 PM

Maxime Peyrard

@peyrardmax.bsky.social

Mechanistic Interpretability aims to produce statements like: "Model M solves task T by doing X."
To do so, many causal manipulations are performed to validate an explanation. But what if (many) other, incompatible explanations also pass the causal tests?

Illustration of different strategies for mechanistic interpretability

April 21, 2025 at 1:52 PM

Maxime Peyrard

@peyrardmax.bsky.social

Our paper "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" will be presented at #ICLR2025!
It's also the first paper of my first PhD student, congrats @maximemeloux.bsky.social! 🎉

blog: melouxm.github.io/MI-identifia...

An explanatory thread 🧵:

April 21, 2025 at 1:52 PM
