🛠️ Xplique library development team member.
How can we compare concept-based #XAI methods in #NLProc?
ConSim (arxiv.org/abs/2501.05855) provides the answer.
Read the thread to find out which method is the most interpretable! 🧵1/7
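If you want the gist before diving in: explanation methods can be compared by how much they help a simulator predict the explained model. A toy sketch of that simulatability idea (placeholder names, not ConSim's actual API):

```python
# Toy simulatability sketch; `model`, `simulate`, and `explain` are
# hypothetical placeholders, not the ConSim implementation.
def simulatability_gain(model, simulate, explain, inputs):
    """How much an explanation helps a simulator predict the model."""
    base = helped = 0
    for x in inputs:
        y = model(x)                                 # explained model's output
        base += int(simulate(x) == y)                # simulator alone
        helped += int(simulate(x, explain(x)) == y)  # simulator + explanation
    # A larger gain means the concepts carry more usable information.
    return (helped - base) / len(inputs)
```

Comparing concept-based methods then reduces to comparing their gains on the same model and data.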
🙏 Thanks again to my amazing co-authors: @alon_jacovi, Agustin Picard, @VictorBoutin, and @Fannyjrd_.
Work done within the DEEL and FOR projects at IRT St Exupéry and @ANITI_Toulouse.
See you in Vienna 📅
For more information, check out my last post:
There are other "Smart" add-ons as well, but that's the one that reads your content.
𝗔𝗻 𝗶𝗻𝘁𝗲𝗿𝗽𝗿𝗲𝘁𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗱𝗲𝗲𝗽 𝗱𝗶𝘃𝗲 𝗶𝗻𝘁𝗼 𝗗𝗜𝗡𝗢𝘃𝟮, one of vision’s most important foundation models.
And today is Part I: buckle up, we're exploring some of its most charming features. :)
This is my first big conference!
📅 Tuesday morning, 10:30–12:00, during Poster Session 2.
💬 If you're around, feel free to message me. I would be happy to connect, chat, or have a drink!
Everyone loves causal interp. It’s coherently defined! It makes testable predictions about mechanistic interventions! But what if we had a different objective: predicting model behavior not under mechanistic interventions, but on unseen input data?
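One toy way to make that objective concrete (my illustration, nothing more): distill the interpretation into a surrogate predictor and score its agreement with the model on held-out inputs.

```python
# Illustrative only: judging an interpretation by out-of-sample behavioral
# prediction instead of interventions. Assumes you can compute the features
# the interpretation claims the model relies on.
import numpy as np
from sklearn.linear_model import LogisticRegression

def predictive_score(feats_train, preds_train, feats_test, preds_test):
    """Agreement between an interpretation-derived surrogate and the
    model's actual predictions on unseen inputs."""
    surrogate = LogisticRegression(max_iter=1000).fit(feats_train, preds_train)
    return float(np.mean(surrogate.predict(feats_test) == preds_test))
```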
Happy to be part of the organizing team this year, and super excited for our new shared task using the excellent MIB Benchmark. Check it out! blackboxnlp.github.io/2025/task/
This edition will feature a new shared task on circuits/causal variable localization in LMs, details here: blackboxnlp.github.io/2025/task
> Follow @actinterp.bsky.social
> Website actionable-interpretability.github.io
@talhaklay.bsky.social @anja.re @mariusmosbach.bsky.social @sarah-nlp.bsky.social @iftenney.bsky.social
Paper submission deadline: May 9th!
The ‘justification’ is campus activism or social media posts.
timesofindia.indiatimes.com/world/us/hun...
📝 Blog post: www.anthropic.com/research/tra...
🧪 "Biology" paper: transformer-circuits.pub/2025/attribu...
⚙️ Methods paper: transformer-circuits.pub/2025/attribu...
Featuring basic multi-step reasoning, planning, introspection and more!
You would expect this in a dictatorship, not the United States.
This country is unrecognizable.
Read our NSF/OSTP recommendations, written with Goodfire's Tom McGrath tommcgrath.github.io, Transluce's Sarah Schwettmann cogconfluence.com, and MIT's Dylan Hadfield-Menell @dhadfieldmenell.bsky.social
TL;DR: Dominance comes from **interpretability** 🧵 ↘️
It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.
Details in 🧵
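Quick usage sketch with Hugging Face transformers; the checkpoint id below is a placeholder since it isn't named here:

```python
# Hedged sketch: fine-tuning setup for a multilingual encoder classifier.
# "org/encoder-id" is a placeholder, not the released model's name.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "org/encoder-id"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

batch = tokenizer(["un exemple", "ein Beispiel"], padding=True, return_tensors="pt")
logits = model(**batch).logits  # train with cross-entropy as usual
```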
We will be working on a new library for interpretability 😀
#XAI for #NLProc under the supervision of Prof. Nicholas Asher,
@philmuller.bsky.social, and @fannyjrd.bsky.social!
My project? Improving the transparency of LLMs through interactive and user-tailored explanations. 🚀