Ben Recht
beenwrekt.bsky.social
Over on the Gelman blog, there's a post about this histogram of z-scores in published papers. I'm always baffled by the discourse around it. Why is this plot a symptom of a problem? Under what model would we expect this to be a bell curve?
November 15, 2025 at 3:31 PM
A brief introduction to adaptive experimentation without the words "exploration-exploitation tradeoff," "multi-armed bandit," or "reinforcement learning."
How to pick a sample size.
A brief introduction to adaptive experimentation without saying "exploration-exploitation tradeoff."
www.argmin.net
November 13, 2025 at 3:53 PM
Following up on Monday’s discussion, I articulate a few concrete positions on archives, surveys, and position papers.
The DOI Directorate
Articulating a few concrete positions on archives, surveys, and position papers
www.argmin.net
November 12, 2025 at 3:30 PM
Based on a fun conversation on here, I wrote about the arXiv position paper controversy and the weird, unwritten, organic evolution of academic practice.
A position on positions
The complex evolution of academic process doesn't always lead us to better practice.
www.argmin.net
November 10, 2025 at 3:42 PM
Kevin O’Connell passed the down 8 test ✅✅✅
November 9, 2025 at 9:36 PM
The Texans went for 2 down 13 in the fourth quarter and won. What do the analytics say about that one?
November 9, 2025 at 9:34 PM
I'm still so confused by this whole dustup.

Why should peer-reviewed papers be posted to pre-print servers?

When did everyone become so committed to churning out position papers?

What happened to the public_html/ directory?
We are not "banning" reviews; we are just requiring peer review first. Good review articles are important for the field!
You can’t really blame arXiv for the decision to stop publishing computer science stuff (given the flood of slop) but this is also a textbook example of a global public good being gratuitously degraded www.nature.com/articles/d41...
November 8, 2025 at 7:49 PM
Everyone knows actions are fundamentally different than predictions, but it's hard to write this distinction in math.
Staging Interventions
Actions are fundamentally different than predictions, but it's hard to write this distinction in math.
www.argmin.net
November 6, 2025 at 3:30 PM
No matter what the models say, one person's statistical fact is another's statistical outlier. (Part 7 in the going-for-2-down-8 microcosm)
Learning from losers
Games are always a microcosm, and that's why I'm hooked on writing about sportsball.
www.argmin.net
November 5, 2025 at 3:30 PM
A machine learning view of the randomized controlled trial, as a bridge from patterns to actions.
Instrumentalized Actuarial Predictions
The randomized controlled trial as a natural extension of machine learning
www.argmin.net
November 4, 2025 at 3:30 PM
alright... so...

joe flacco throws a pick 6
it gets called back on BS grounds
bengals score TD to go down 8
go for 2 and make it
get the onside kick on the most sketchy foot contact
get another TD to go up 1
forget to tackle the bears tight end who scores
flacco throws a pick
bengals lose
November 2, 2025 at 9:40 PM
oh my
OK, I have an analytics post drafted, but there is still plenty of time left on the clock for Joe Flacco.
November 2, 2025 at 9:37 PM
OK, I have an analytics post drafted, but there is still plenty of time left on the clock for Joe Flacco.
November 2, 2025 at 9:34 PM
Ah, it's becoming even clearer! Probabilities change if you know the future.
Decision analysis on the Jets go for 2, down 8 yesterday

WP Go for 2: 12.3%
WP PAT: 11.2%

BUT KEEP IN MIND, the WP difference would be much larger if we knew at that moment Jets would score another TD and hold Bengals -- which is the only world where this matters. That's why it's a big deal!
October 27, 2025 at 3:36 PM
How two mathematicians resolved a 50-year-old open problem by finding the solution in an 80-year-old paper. Human knowledge is unsearchable.
The fine art of crate digging
Alexeev and Mixon's resolution of Erdős problem 707 and the vastness of the library.
www.argmin.net
October 27, 2025 at 3:17 PM
The Jets went for 2 down 8 and won! This changes everything.
October 26, 2025 at 8:21 PM
This paper is a delight along multiple axes. It describes an Erdős problem that was solved 30 years before Erdős declared it open. arxiv.org/abs/2510.19804
Forbidden Sidon subsets of perfect difference sets, featuring a human-assisted proof
We resolve a $1000 Erdős prize problem, complete with formal verification generated by a large language model. In over a dozen papers, beginning in 1976 and spanning two decades, Paul Erdős repeated...
arxiv.org
October 25, 2025 at 3:13 PM
It is maddeningly frustrating to try to explain the incoherent assemblage of heuristics that is reinforcement learning.
October 25, 2025 at 2:28 PM
A great rant by hater/legend @drewmagary.bsky.social about why sports is the best when it's improbable. Also, appreciate the shoutout to my anti-analytics analytics.
October 24, 2025 at 5:38 PM
Revisiting last week’s open problems scandal, I wrote about LLMs as Lore Laundering Machines and why some are blind to the novelty whitewashing.
Lore Laundering Machines
When insight comes from willful forgetting.
www.argmin.net
October 24, 2025 at 2:28 PM
Conformal prediction in a tweet: predict an empirical quantile instead of an empirical mean. Trustworthy AI, here we come.
Maybe You're Wrong
Quantiles, Prediction Intervals, and what theory can tell you about the future.
www.argmin.net
October 23, 2025 at 2:32 PM
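To make the one-liner above concrete, here is a minimal sketch of split conformal prediction for a single point forecast, assuming you already have residuals from a held-out calibration set. The function names, the toy Gaussian residuals, and the choice of absolute residual as the score are illustrative assumptions, not taken from the linked post.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score.

    If that rank exceeds n, the calibration set is too small for the
    requested coverage, and the only valid answer is an infinite width.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, calibration_residuals, alpha=0.1):
    """Symmetric interval around a point prediction.

    The half-width is an empirical quantile of the absolute calibration
    residuals, rather than anything derived from an empirical mean.
    """
    scores = [abs(r) for r in calibration_residuals]
    q = conformal_quantile(scores, alpha)
    return point_prediction - q, point_prediction + q


# Hypothetical calibration residuals (true value minus model prediction).
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(199)]

low, high = prediction_interval(5.0, residuals, alpha=0.1)
print(f"90% prediction interval around 5.0: [{low:.2f}, {high:.2f}]")
```

The usual guarantee for this construction assumes the calibration points and the new point are exchangeable; the sketch does not check that assumption.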
The list of signatories on the latest "Please Skynet, don't kill us" letter is BONKERS.
October 22, 2025 at 5:54 PM
I am unironically writing a book about this.
I am writing a paper "Why small samples are all you need, small effects are really important, you do not need to correct for multiple comparisons, and invalid measures are useful". But because I really believe it, and really not just for the attention and citations, I swear. 🙄
October 21, 2025 at 7:44 PM
Randomization is a powerful algorithmic tool, but its value strongly depends on context. Just compare quicksort to confidence intervals.
You're probably right
The value of randomized algorithms depends on verification.
www.argmin.net
October 21, 2025 at 2:31 PM
PS The title of today’s post is inspired by this must-read classic by Stark and Freedman.
October 20, 2025 at 7:05 PM