Principal Researcher @ CENTAI.eu | Leading the Responsible AI Team. Building Responsible AI through Explainable AI, Fairness, and Transparency. Researching Graph Machine Learning, Data Science, and Complex Systems to understand collective human behavior.
The way you conceptualize AI systems affects how you interact with them, do science on them, and create policy and apply laws to them.
Hope you will check it out!
www.science.org/doi/full/10....
#LLMs #AI #Interpretability
Reposted by André Panisson
Presents a framework categorizing MLLM explainability across data, model, and training perspectives to enhance transparency and trustworthiness.
📝 arxiv.org/abs/2412.02104
Reposted by André Panisson
arxiv.org/abs/2411.14257
arxiv.org/abs/2406.04093
Reposted by André Panisson
by @norabelrose.bsky.social et al.
An open-source pipeline for finding interpretable features in LLMs with sparse autoencoders and automated explainability methods from @eleutherai.bsky.social.
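The core of the automated-explainability step in pipelines like this is surfacing the inputs on which each SAE latent fires strongest, then handing those examples to an explainer model. A minimal NumPy sketch of that first step — the names here are illustrative, not EleutherAI's actual API:

```python
import numpy as np

def top_activating_examples(acts, tokens, feature, n=5):
    """Return the n tokens where one SAE latent fires strongest.

    acts:    (num_tokens, num_features) array of latent activations
    tokens:  list of the corresponding token strings
    feature: index of the latent to inspect
    """
    order = np.argsort(acts[:, feature])[::-1][:n]  # strongest first
    return [(tokens[i], float(acts[i, feature])) for i in order]
```

An explainer LLM would then be prompted with these (token, activation) pairs to propose a human-readable description of the feature.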
arxiv.org/abs/2410.13928
📍 “A True-to-the-Model Axiomatic Benchmark for Graph-based Explainers”
🗓️ Tuesday 4–6 PM CET
📌 Poster Session 2, GatherTown
Join us to discuss graph ML explainability and benchmarks
#ExplainableAI #GraphML
openreview.net/forum?id=HSQTv3R8Iz
Reposted by André Panisson
-NeurIPS2024 Communication Chairs
Reposted by André Panisson
How can AI *boost* human decision-making instead of replacing it? We talk about this in our new paper.
doi.org/10.1037/dec0...
#AI #XAI #InterpretableAI #IAI #boosting #competences
🧵👇
Reposted by André Panisson
But then I found the paper "Mechanistic?" by
@nsaphra.bsky.social and @sarah-nlp.bsky.social, which clarified things.
Reposted by Rachel Killean, André Panisson
💙, Mar🐫
openreview.net/forum?id=WCR...
They simplify tuning with k-sparse autoencoders, and the results show consistent improvements in explainability. Code, models (not all!), and a visualizer are included. openreview.net/forum?id=tcs...
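For context: a k-sparse autoencoder keeps only the k largest hidden activations per sample and zeroes the rest, enforcing sparsity directly instead of via an L1 penalty coefficient that must be tuned. A toy NumPy sketch of the idea — illustrative only, not the paper's implementation:

```python
import numpy as np

class KSparseAutoencoder:
    """Toy k-sparse autoencoder: only each sample's k largest hidden
    activations survive; the rest are zeroed before decoding."""

    def __init__(self, d_in, d_hidden, k, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.1, (d_in, d_hidden))
        self.W_dec = rng.normal(0.0, 0.1, (d_hidden, d_in))
        self.k = k

    def encode(self, x):
        h = x @ self.W_enc
        # zero everything except each row's k largest activations
        drop = np.argsort(h, axis=1)[:, :-self.k]
        np.put_along_axis(h, drop, 0.0, axis=1)
        return h

    def decode(self, h):
        return h @ self.W_dec

    def train_step(self, x, lr=0.05):
        h = self.encode(x)
        err = self.decode(h) - x          # reconstruction error
        n = len(x)
        grad_dec = h.T @ err / n
        # straight-through gradient: treat the top-k mask as a constant
        mask = (h != 0).astype(float)
        grad_enc = x.T @ ((err @ self.W_dec.T) * mask) / n
        self.W_dec -= lr * grad_dec
        self.W_enc -= lr * grad_enc
        return float((err ** 2).mean())     # MSE loss
```

The sparsity level k replaces the sparsity-penalty hyperparameter, which is what makes tuning simpler.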