Mike Hearn
mikehearn.bsky.social
Tricking LLMs with the "counting letters" prompt is like showing humans an optical illusion and then, when the human perceives it incorrectly, using it as evidence that humans aren't intelligent. It targets a specific blind spot in how we operate but isn't really representative of anything else.
We have a lot of fun tripping up AI with this, but asking it to parse a word by individual letters is kind of a nonsensical question given how tokenizers operate. It's like asking a Chinese speaker how many G's are in 中国; that's not how they process language.
August 8, 2025 at 12:42 PM
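A rough sketch of the tokenizer point above, in Python. The vocabulary, the token IDs, and the `toy_tokenize` helper are all made up for illustration; real tokenizers (byte-pair encoding and similar) are learned from data and split words differently. The only idea shown is that the model receives a few multi-letter pieces as integer IDs rather than a sequence of letters.

```python
# Hypothetical toy vocabulary: NOT taken from any real model's tokenizer.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: emit the single character.
            pieces.append(word[i])
            i += 1
    return pieces


pieces = toy_tokenize("strawberry", TOY_VOCAB)
# IDs are what the model would actually see; 0 stands in for "unknown".
ids = [TOY_VOCAB.get(piece, 0) for piece in pieces]

print(pieces)                     # the multi-letter pieces
print(ids)                        # the integer IDs passed to the model
print("strawberry".count("r"))    # the letter-level view a human uses
```

Counting letters is trivial on the raw string, but under this assumed split the information about individual letters is hidden inside three opaque IDs.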
OpenAI should just automatically enable thinking for this dumb question that only exists to trick LLMs.
August 8, 2025 at 12:12 PM
A lot of angry and upset people in this thread, but almost no one seems to understand the specifics of what they're angry about. This reporter asked ChatGPT to write about something in her own voice, and it did (privately, just to her). WaPo has absolutely nothing to do with this.
When I left the Washington Post in 2023, my colleagues and I were worried AI would hoover up our work and enable "zombies" to churn out stories based on our reporting and writing styles.

Today WaPo announced a partnership with OpenAI. And here it is: Zombie Gillian Brockell. I DID NOT WRITE THIS.
April 23, 2025 at 5:15 AM
There's also been a clear external change in the way insiders are tweeting and talking about the future/AGI (mostly OpenAI folks). It seems like the reasoning models (o*) must have unlocked something.
There has been a definite shift in recent weeks where insiders in the various AI labs are suggesting that very intelligent AIs are coming very soon.

I wrote a bit about why this might be happening and what we can take away from their apparent confidence. www.oneusefulthing.org/p/prophecies...
Prophecies of the Flood
What to make of the statements of the AI labs?
www.oneusefulthing.org
January 10, 2025 at 7:59 PM
I'll save everyone time and just tell you that the similarities are that they are both hyped and have weird corporate structures, and that's about it.
Who is up for a short thread on the intriguing parallels between OpenAI and Enron, as told by one of Enron's investment bankers -- that is to say, me? Here goes...
January 6, 2025 at 10:20 PM
OpenAI announced the $200 plan 12 days before announcing o3-mini, a model comparable to o1 at 1/10th of the cost. It's a foregone conclusion that the cost of intelligence is going to go down, so losing money in the very short term means nothing for their future profitability.
January 6, 2025 at 2:52 PM
If we're just adding football games at all levels, why not have the CFB champion play the NFL team with the #1 draft pick after both the NFL and CFB seasons are over? Pay-per-view. Who says no?
January 3, 2025 at 12:47 AM
No one cares, not even me, but I asked o1 Pro (currently the smartest — nay, wisest — LLM) to rule on the TikTok ban currently before the Supreme Court and it threw out the ban on 1A grounds.
December 30, 2024 at 9:55 PM
@raycast.com Are o1 or the new Gemini models coming to the advanced AI subscription? Feels like those two places are where the current advances have been happening for the last few months. Also Deepseek's release today seems impressive. Just generally hoping for updates.
December 26, 2024 at 9:26 PM
I updated my extremely non-rigorous NYT Connections LLM Benchmark page with the new Flash 2.0 Thinking model. Turns out that the new "thinking" model isn't much better at Connections than Claude or Gemini 2.0 Advanced. It got 1/15 whereas o1 Pro went 14/15. mikehearn.notion.site/155c9175d234...
AI Does NYT Connections | Notion
Measuring LLM intelligence through the New York Times “Connections” game. ✅ = Correct group with the correct connection ⚠️ = Correct group with the wrong connection (still counts as a “win”) ❌ = Inco...
mikehearn.notion.site
December 20, 2024 at 3:42 PM
Google isn't exactly dead, but I do think there will come a day when manually sifting through web pages to find a specific piece of information will be as antiquated as going to the library to do research.
BREAKING 🚨: OpenAI announced Search improvements for ChatGPT during the 8th Day 👀

- Partnership with Foursquare
- Search for free users
- Search in voice mode 🔥
- Speed improvements for fast queries
- Upgraded mobile search experience (new widgets)

* Gradually rolling out now
December 16, 2024 at 6:37 PM
I stopped benchmarking NYT Connections against the top 4 LLMs because it was such a rout. From Nov. 26 - Dec. 10, o1 Pro correctly solved 14/15 puzzles. o1 solved 10/15. The non-reasoning models just aren't built for it. Claude went 1/15, Gemini-1206 went 0/15.

mikehearn.notion.site/155c9175d234...
AI Does NYT Connections | Notion
Measuring LLM intelligence through the New York Times “Connections” game. ✅ = Correct group with the correct connection ⚠️ = Correct group with the wrong connection (still counts as a “win”) ❌ = Inco...
mikehearn.notion.site
December 16, 2024 at 3:52 PM
I think my new thing is going to be racing ChatGPT's o1 model in NYT Connections.
December 6, 2024 at 1:28 AM
What exactly is Moore doing here?
November 30, 2024 at 10:40 PM
Darnold looking back at the ref being like, are you seeing this shit
October 25, 2024 at 3:28 AM
I agree that domain verification ≠ actual verification, but one interesting aspect of the domain system is that it's possible for third parties to take on the load of "verification" by assigning notable people a subdomain and doing the legwork of verifying they are who they say they are.
As Bluesky gets "the juice" it is going to need verification in my very humble opinion. @jay.bsky.team @pfrazee.com
October 23, 2024 at 8:37 PM
Being on Bluesky now is kind of liberating because back on the ol' X platform, or whatever, Alan Menken accidentally followed me years ago and from that moment I vowed to never disappoint Alan Menken. In practice what that meant was I never tweeted again.
October 21, 2024 at 2:34 PM
I flipped a coin 10 times and only 4 were heads but then I flipped it 10 more times and 6 were heads. I had to hand it to heads, an incredible turnaround.
When I saw them at Citi field on Memorial Day they were unwatchable. They pulled themselves out of a nosedive and made it to within two wins of the World Series.
god what a great season for the mets
October 21, 2024 at 2:22 PM