sambowyer.bsky.social
@sambowyer.bsky.social
Our paper on the best way to add error bars to LLM evals is on arXiv! TL;DR: Avoid the Central Limit Theorem -- there are better, simple Bayesian and frequentist methods you should be using instead.
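
For a concrete sense of why CLT-based error bars can mislead on small evals, here is a minimal sketch. The post does not say which methods the paper recommends, so the choice of the Wilson score interval as the frequentist alternative is an assumption made for illustration, and the function names `wald_interval` and `wilson_interval` are hypothetical (they are not taken from the linked library).

```python
import math
from statistics import NormalDist


def wald_interval(correct, n, level=0.95):
    """CLT-based (Wald) interval for accuracy: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    p_hat = correct / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(correct, n, level=0.95):
    """Wilson score interval, one standard alternative (an assumption, not the paper's stated choice)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = correct / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# A small eval where the model answers all 20 questions correctly.
print("Wald:  ", wald_interval(20, 20))
print("Wilson:", wilson_interval(20, 20))
```

With 20 out of 20 correct, the Wald interval collapses to zero width at 1.0, which claims certainty from only 20 questions, while the Wilson interval keeps a lower bound of roughly 0.84.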

We also provide a super lightweight library: github.com/sambowyer/baye… 🧵👇
March 6, 2025 at 3:00 PM