Michael Ritchot
ritchot.me
@ritchot.me
International Educator | Teacher | Coach | Re-thinking Education, Society, Work, and Life through Technology | M.Ed. in Education Technology & Instructional Design

https://ritchot.me/
https://www.linkedin.com/in/mritchot
It is a good reminder to build your own reproducible benchmarks on your own tasks instead of relying on vendor benchmarks or marketing hype.

None of this is to say year-over-year model progress has not been genuinely impressive. Just be cautious, and run your own tests.
November 26, 2025 at 3:09 AM
Whether it is Gemini supposedly solving full math problems, with work shown, using NanoBanana Pro, or generating accurate isometric and perspective views directly from floor plans (literally the work I am doing right now; it makes a lot of mistakes and is not at all consistent).
November 26, 2025 at 3:09 AM
In the piece I walk through the setup, limitations, and what this means for homework, in-class work, and “authentic” tasks in secondary math.
November 23, 2025 at 12:01 PM
At that level, standard take-home math is no longer AI-safe in any meaningful way. As a classroom teacher, that matters more to me than another benchmark. If a student wants to outsource their worksheet, friction is low and errors will not expose them quickly.
November 23, 2025 at 12:01 PM
Short version: o1 pro only hit the 4/4 reliability bar on 40/60 questions (67%) and got 177/240 attempts correct (74%). Under essentially the same framework, GPT-5 hit 48/50 questions at 4/4 and 392/400 attempts (98%). We moved from 1-in-4 errors to about 1-in-50.
November 23, 2025 at 12:01 PM
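The two figures quoted above (the share of questions that clear the 4/4 bar, and the share of individual attempts that are correct) can be reproduced with a few lines of Python. This is a minimal sketch, not the harness used in the piece: the names `summarize` and `one_in_n` are hypothetical, and it assumes the "4/4 reliability bar" means every attempt on a question was correct.

```python
def summarize(results):
    """Summarize repeated-attempt results.

    results: list of per-question lists of booleans, one boolean per attempt.
    Assumption: a question clears the reliability bar only if ALL of its
    attempts were correct (e.g. 4/4).
    """
    total_questions = len(results)
    total_attempts = sum(len(attempts) for attempts in results)
    reliable = sum(1 for attempts in results if all(attempts))
    correct = sum(sum(attempts) for attempts in results)
    return {
        "reliable_rate": reliable / total_questions,
        "attempt_accuracy": correct / total_attempts,
    }


def one_in_n(correct, total):
    """Express the error rate as '1 error in every n attempts'."""
    errors = total - correct
    return total / errors


# Toy data: two questions, four attempts each (illustrative only).
toy = [[True, True, True, True], [True, False, True, True]]
toy_summary = summarize(toy)

# Checking the arithmetic in the post, using only the counts it quotes.
o1_question_rate = 40 / 60        # questions at 4/4
o1_attempt_rate = 177 / 240       # attempts correct
gpt5_attempt_rate = 392 / 400     # attempts correct
o1_odds = one_in_n(177, 240)      # roughly 1 error in 4 attempts
gpt5_odds = one_in_n(392, 400)    # 1 error in 50 attempts
```

Tracking both numbers is worthwhile because they answer different questions: attempt accuracy says how often a single answer is right, while the all-attempts bar says how often a student could rely on the model without checking.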
I wrap up with an idea on what I would have preferred to see from the study. Read more below:

ritchot.me/on-writing-a...
on writing, and an MIT study
When a thought crosses my mind, or I start consistently coming across a topic that I may find value in writing about later, I typically store these...
ritchot.me
June 25, 2025 at 11:01 PM
I feel there were several oddities about the paper, and that discourse surrounding the paper is misinterpreting it at best.

Short version: I am incredibly happy that the authors made the paper so publicly available, but they made several odd decisions.
June 25, 2025 at 11:01 PM
Google's AI co-scientist is a promising example, even generating novel ideas and reinforcing the trend of AI-human collaboration in advancing knowledge fields.

research.google/blog/acceler...
Accelerating scientific breakthroughs with an AI co-scientist
research.google
February 20, 2025 at 6:31 AM
Why not just errors? I will never understand the time spent on this discourse.
February 19, 2025 at 12:40 PM
This is the first of what will likely be a few pieces (with any follow-ups from here being a bit more in depth and more specific) exploring how these shifts will redefine education, work, and society in ways we’re only beginning to grasp.

ritchot.me/some-thought...
some thoughts on emergent technology and the future of education
We often envision the future of technology by projecting today’s society forward, rather than considering how fundamentally different it might become...
ritchot.me
February 18, 2025 at 10:32 AM
There is likely still plenty of debate on how to get students there, and on how to shift them up Bloom's taxonomy.
brAIn drAIn
The enhancement and atrophy of human cognition go hand in hand
www.theintrinsicperspective.com
February 14, 2025 at 12:53 AM