Alexander Rubinstein
arubique.bsky.social
PhD student at the University of Tübingen and IMPRS-IS
🪩 Evaluate your LLMs on benchmarks like MMLU at 1% cost.

In our new paper, we show that outputs on a small subset of test samples that maximise diversity in model responses are predictive of the full dataset performance.

Project page: arubique.github.io/disco-site/

More below 🧵👇
October 10, 2025 at 9:42 AM