Andreas Madsen
@andreasmadsen.bsky.social
Ph.D. in NLP Interpretability from Mila. Previously: independent researcher, freelancer in ML, and Node.js core developer.
FMMs (Faithfulness Measurable Models) are models designed so that measuring the faithfulness of an explanation is cheap and precise, which makes it possible to optimize explanations toward maximum faithfulness.
November 28, 2024 at 2:02 PM
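To make the idea concrete, here is a minimal sketch in Python of the two ingredients the post describes: an erasure-based faithfulness measurement that is cheap because masked inputs stay in-distribution, and a search that optimizes an explanation toward that score. The classifier `model(tokens)` and `MASK_ID` are hypothetical stand-ins, not the paper's actual API.

```python
import numpy as np

MASK_ID = 0  # hypothetical mask-token id; assumes masking was trained in-distribution


def faithfulness(model, tokens, importance, label, steps=10):
    """Erasure-based faithfulness: mask tokens from most to least important
    and measure how quickly the probability of `label` drops. Because the
    model supports masked inputs in-distribution, the drop is attributable
    to the explanation rather than to out-of-distribution artifacts."""
    order = np.argsort(importance)[::-1]  # most important tokens first
    p0 = model(tokens)[label]
    masked = np.array(tokens)
    drops = []
    for i in range(0, len(order), max(1, len(order) // steps)):
        masked[order[:i + 1]] = MASK_ID
        drops.append(p0 - model(masked)[label])
    return float(np.mean(drops))  # higher = more faithful


def optimize_explanation(model, tokens, label, iters=100, rng=None):
    """Because faithfulness is cheap to evaluate, we can search directly for
    the explanation that maximizes it. Random local search over token
    importance scores is used here purely for illustration; it is not the
    paper's optimization procedure."""
    rng = rng or np.random.default_rng(0)
    best = rng.random(len(tokens))
    best_score = faithfulness(model, tokens, best, label)
    for _ in range(iters):
        candidate = best + 0.1 * rng.standard_normal(len(tokens))
        score = faithfulness(model, tokens, candidate, label)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```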
Self-explanations are explanations an LLM produces of its own behavior. Current models are not capable of this, but we suggest how that could be changed.
[Image: Diagram of self-explanations, showing the input going in, then the regular output and the explanation coming out.]
November 28, 2024 at 2:02 PM