Lightnews — Scholar-powered news

Mike Dodds

@m-dodds.bsky.social

210 followers 55 following 55 posts

Formal methods nitwit. https://mikedodds.github.io

AI / math / formal methods paper feed: @ai-fm-papers.bsky.social

Mike Dodds

@m-dodds.bsky.social

That’s true, and I think that’s exactly why Claude does so well at proofs. It’s just that I happen to know first hand that proofs are a particularly difficult *kind* of program

September 21, 2025 at 1:12 AM

Mike Dodds

@m-dodds.bsky.social

Hey :) Seems like a lot of people moved here from old Twitter and I’m still catching up

June 25, 2025 at 10:51 PM

Mike Dodds

@m-dodds.bsky.social

If a tool is not popular, it’s uncompelling to argue that everyone is just mistaken. At some point you should ask why the tool isn’t useful (at the current cost/benefit point)

Text screenshot: I sometimes hear people claiming that formal methods are demonstrably better than the techniques software engineers mostly use today. The only reason formal techniques aren’t more popular (according to this theory) is that engineering teams are unaware, conservative, maybe put off by superficial difficulties like poor interfaces and documentation. I don’t think this is quite right. My observation is that engineers are mostly rational when thinking about costs and benefits, at least within the bounds of their particular systems and problems.

May 24, 2025 at 2:12 AM

Mike Dodds

@m-dodds.bsky.social

I do think a lot of people are in denial though!

January 21, 2025 at 11:49 PM

Mike Dodds

@m-dodds.bsky.social

I don’t think literally everyone should drop what they’re doing. But my sense is PL research as a whole is significantly under-reacting to AI. So I suppose I think *some more* PL people should bet on AI (but maybe not you!)

January 21, 2025 at 6:20 PM

Mike Dodds

@m-dodds.bsky.social

Happy to mail you a couple. Email me, my address is on my website

January 21, 2025 at 7:15 AM

Mike Dodds

@m-dodds.bsky.social

I think you’ve put your finger on the exact worldview mismatch because 5-10 years seems like an insanely long time horizon to me

January 21, 2025 at 5:07 AM

Mike Dodds

@m-dodds.bsky.social

Why constrain the grammar - just pull more samples and keep the ones that pass :p

January 21, 2025 at 5:02 AM

Mike Dodds

@m-dodds.bsky.social

8 years on, the future is here! xkcd.com/1813/

Vomiting Emoji

xkcd.com

December 27, 2024 at 1:42 AM

Mike Dodds

@m-dodds.bsky.social

If I understand right, the private test set is only used during evaluation of the model - not available to the team doing the training

December 21, 2024 at 10:16 PM

Mike Dodds

@m-dodds.bsky.social

Seems almost certain it’s deliberately trained on math reasoning. The way the o-series models seem to work is by long CoT, with reinforcement learning to impose correct reasoning. Not much public about how o3 works internally, but Chollet has some speculation: arcprize.org/blog/oai-o3-...

OpenAI o3 Breakthrough High Score on ARC-AGI-Pub

OpenAI o3 scores 75.7% on ARC-AGI public leaderboard.

arcprize.org

December 21, 2024 at 5:44 PM

Mike Dodds

@m-dodds.bsky.social

A big jump in coding skill as well:

December 21, 2024 at 5:19 PM
