Sebastian Bordt
@sbordt.bsky.social
Language models and interpretable machine learning. Postdoc @ Uni Tübingen.

https://sbordt.github.io/
Pinned
Have you ever wondered whether a few instances of data contamination really lead to benchmark overfitting?🤔 Then our latest #ICML paper about the effect of data contamination on LLM evals might be for you!🚀

Paper: arxiv.org/abs/2410.03249
👇🧵
Reposted by Sebastian Bordt
Here is a formal impossibility result for XAI: Informative Post-Hoc Explanations Only Exist for Simple Functions. I'll give an online presentation about this work next Tuesday in @timvanerven.nl's Theory of Interpretable AI Seminar:

arxiv.org/abs/2508.11441

tverven.github.io/tiai-seminar/
November 7, 2025 at 6:25 AM
Reposted by Sebastian Bordt
🚨 Workshop on the Theory of Explainable Machine Learning

Call for ≤2-page extended abstract submissions by October 15 is now open!

📍 Ellis UnConference in Copenhagen
📅 Dec. 2
🔗 More info: sites.google.com/view/theory-...

@gunnark.bsky.social @ulrikeluxburg.bsky.social @emmanuelesposito.bsky.social
Theory of XAI Workshop, Dec 2, 2025
Explainable AI (XAI) is now deployed across a wide range of settings, including high-stakes domains in which misleading explanations can cause real harm. For example, explanations are required by law ...
sites.google.com
September 30, 2025 at 2:01 PM
Reposted by Sebastian Bordt
I am hiring PhD students and/or postdocs to work on the theory of explainable machine learning. Please apply through ELLIS or IMPRS; deadlines are end of October / mid-November. In particular: Women, where are you? Our community needs you!!!

imprs.is.mpg.de/application
ellis.eu/news/ellis-p...
September 17, 2025 at 6:18 AM
Reposted by Sebastian Bordt
We need new rules for publishing AI-generated research. The teams developing automated AI scientists have customarily submitted their papers to standard refereed venues (journals and conferences) and to arXiv. Often, acceptance has been treated as the dependent variable. 1/
September 14, 2025 at 5:15 PM
Reposted by Sebastian Bordt
This new center strikes the right tone in approaching the AI alignment problem. alignmentalignment.ai
Center for the Alignment of AI Alignment Centers
We align the aligners
alignmentalignment.ai
September 11, 2025 at 8:47 PM
Reposted by Sebastian Bordt
A new recording of our FridayTalks@Tübingen series is online!

How much can we forget about Data Contamination?
by
@sbordt.bsky.social

Watch here: youtu.be/T9Y5-rngOLg
How much can we forget about Data Contamination? - [Sebastian Bordt]
YouTube video by Friday Talks Tübingen
youtu.be
August 29, 2025 at 7:05 AM
Reposted by Sebastian Bordt
The stochastic parrot is now an IMO gold medalist parrot
July 19, 2025 at 8:50 PM
I'm at #ICML in Vancouver this week. Hit me up if you want to chat about pre-training experiments or explainable machine learning.

You can find me at these posters:

Tuesday: How Much Can We Forget about Data Contamination? icml.cc/virtual/2025...
July 14, 2025 at 2:49 PM
Reposted by Sebastian Bordt
Our #ICML position paper: #XAI is similar to applied statistics: it uses summary statistics in an attempt to answer real-world questions. But authors need to state concretely (!) how their XAI statistics contribute to answering which concrete (!) question!
arxiv.org/abs/2402.02870
Over the last couple of years, we have read a lot of papers on explainability and often felt that something was fundamentally missing🤔

This led us to write a position paper (accepted at #ICML2025) that attempts to identify the problem and to propose a solution.

arxiv.org/abs/2402.02870
👇🧵
July 11, 2025 at 7:35 AM
Over the last couple of years, we have read a lot of papers on explainability and often felt that something was fundamentally missing🤔

This led us to write a position paper (accepted at #ICML2025) that attempts to identify the problem and to propose a solution.

arxiv.org/abs/2402.02870
👇🧵
July 10, 2025 at 5:58 PM
Have you ever wondered whether a few instances of data contamination really lead to benchmark overfitting?🤔 Then our latest #ICML paper about the effect of data contamination on LLM evals might be for you!🚀

Paper: arxiv.org/abs/2410.03249
👇🧵
July 8, 2025 at 6:42 AM
In explainable machine learning, we mostly have negative results, showing what post-hoc explanations cannot do.

This work presents a surprisingly strong positive result for SHAP, showing that a simple sampling modification makes it possible to reliably detect features that don't influence the model.
Ever aggregated SHAP values across sample points? Our #COLT2025 paper proves that this might be safe when your goal is to discard unimportant features, but only if you add one extra line of code that reshuffles your data! With Robi Bhattacharjee and Karolin Frohnapfel.
arxiv.org/abs/2503.23111
How to safely discard features based on aggregate SHAP values
SHAP is one of the most popular local feature-attribution methods. Given a function f and an input x, it quantifies each feature's contribution to f(x). Recently, SHAP has been increasingly used for g...
arxiv.org
May 16, 2025 at 10:59 AM
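For intuition, here is a minimal toy sketch of the recipe above: aggregate absolute SHAP values over many points and keep an eye on the features with the smallest aggregates. Everything below (the dataset, the split sizes, and reading the "extra line" as a random permutation of the data before the background/foreground split) is my own illustrative assumption; the paper's actual procedure is in the arXiv link.

```python
# Toy sketch (not the paper's code): rank features by aggregate |SHAP|.
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=10, n_informative=5, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X = X[rng.permutation(len(X))]  # the assumed "extra line": reshuffle before splitting

background, foreground = X[:100], X[100:150]  # now i.i.d. draws from the same data
explainer = shap.Explainer(model.predict, background)
shap_values = explainer(foreground).values

aggregate = np.abs(shap_values).mean(axis=0)  # aggregate importance per feature
print("features, least to most important:", np.argsort(aggregate))
```

Shuffling before splitting makes the background sample and the explained points independent draws from the same distribution, which is the property this sketch relies on.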
Reposted by Sebastian Bordt
Finally made it to bluesky as well ...
May 5, 2025 at 8:58 AM
Reposted by Sebastian Bordt
Is the distinction between "aleatoric" and "epistemic" uncertainty practically meaningful (or even well defined) in any real sense? Aleatoric uncertainty refers to irreducible unpredictability (e.g. unrealized randomness in nature), whereas epistemic refers to model uncertainty.
March 20, 2025 at 1:33 PM
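A toy example (my own illustration, not from the post): for a coin with unknown bias, uncertainty about the bias is epistemic and shrinks as we flip more, while the randomness of the next flip is aleatoric and stays put.

```python
# Coin with unknown bias: epistemic uncertainty shrinks with data, aleatoric doesn't.
import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7

for n in [10, 100, 10_000]:
    flips = rng.binomial(1, true_p, size=n)
    a, b = 1 + flips.sum(), 1 + n - flips.sum()  # Beta(1,1) posterior over the bias
    epistemic_sd = np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))  # posterior sd of the bias
    aleatoric_sd = np.sqrt(true_p * (1 - true_p))  # sd of a single flip, irreducible
    print(f"n={n:6d}  epistemic sd={epistemic_sd:.4f}  aleatoric sd={aleatoric_sd:.4f}")
```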
I really like coding with LLMs. This week, Claude & ChatGPT convinced me that my code was too slow. After 2 days of investigation, I think my code is just fine. Never again will I blindly trust you with my profiler logs!🤖
March 18, 2025 at 8:42 PM
I really like the new HTML preview on arXiv, but it somehow handles LaTeX errors differently from the PDF. I've been seeing lots of error messages in ICML papers lately.
March 13, 2025 at 11:00 AM
can you draw me a dragon in tikz
February 28, 2025 at 10:50 AM
I just asked aistudio.google.com to write a review for a paper that we will submit to ICML. It's impressive. I believe that with this tool, I could produce a mediocre review for almost any paper in less than 10 minutes (judged by the standard of reviews that we have at ML conferences).
January 25, 2025 at 12:24 PM
The chain of thought in DeepSeek-R1 is pretty impressive.
January 20, 2025 at 9:35 PM
Reposted by Sebastian Bordt
ICML 2025 has some exciting changes. Here are two of my favorites.

1. Only one round of back-and-forth between authors & reviewers. The review process should not be an endless back-and-forth. It shouldn't be possible to get your paper accepted by exhausting reviewers.
December 20, 2024 at 12:42 AM
Are you interested in data contamination and LLM benchmarks?🤖

Check out our poster today at the NeurIPS ATTRIB workshop (3-4:30pm)!

💡 TL;DR: In the large-data regime, a few instances of data contamination matter less than you might think.
December 14, 2024 at 7:53 PM
Reposted by Sebastian Bordt
Recent works have proposed using publicly available Kaggle competitions to benchmark LLMs, most famously OpenAI's openai.com/index/mle-be....

In this blog post, I show how to test LLMs for contamination with Kaggle competitions (of course, there is contamination).

sbordt.substack.com/p/data-conta...
Data Contamination in MLE-bench
How to test language models for prior exposure with tabular datasets
sbordt.substack.com
November 19, 2024 at 5:21 PM
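The basic idea of such a contamination test fits in a few lines: show the model the first rows of a competition's CSV file and check whether it completes the next row verbatim. This is a hypothetical sketch, not the blog's code; the file name, the prompt wording, and the choice of model and OpenAI client are my assumptions.

```python
# Hypothetical sketch of a row-completion contamination test.
# Assumes a local Kaggle file "train.csv" and the OPENAI_API_KEY env variable.
from openai import OpenAI

client = OpenAI()

with open("train.csv") as f:
    lines = f.read().splitlines()

prompt_rows = "\n".join(lines[:10])  # header plus the first few rows
held_out_row = lines[10]             # the row the model should not know

response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[{
        "role": "user",
        "content": f"Complete the next row of this csv file:\n{prompt_rows}",
    }],
)
completion = response.choices[0].message.content.strip()

# A verbatim match on a supposedly unseen row is evidence of contamination.
print("verbatim match:", completion.startswith(held_out_row))
```

Individual data rows have high entropy, so a verbatim completion is hard to explain unless the file was in the training data.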
Reposted by Sebastian Bordt
thinking of calling this "The Illusion Illusion"

(more examples below)
December 1, 2024 at 2:33 PM
Reposted by Sebastian Bordt
Check out this starter pack!

go.bsky.app/BYkRryU
November 30, 2024 at 4:13 AM