Lightnews — Scholar-powered news

Mike Dodds

@m-dodds.bsky.social

I got curious whether Claude Code could handle a low-representation theorem prover like ACL2 - turns out yes! I proved a bunch of small to medium theorems, and for good measure built an MCP server, all in about 4 hrs. I’ve never used ACL2 before. Write-up here: mikedodds.org/posts/2025/1...

Experimenting with ACL2 and Claude Code

TL;DR: Using only prompting with Claude Code, I created: 50+ ACL2 theorem proofs translated from Software Foundations, and an MCP server for ACL2 with stateful solver sessions

mikedodds.org

October 9, 2025 at 11:48 PM

Mike Dodds

@m-dodds.bsky.social

I wrote about Claude Code, which to my absolute astonishment is quite good at theorem proving. For people who don't know theorem proving, this is like spending your whole life building F1 engines and getting lapped by a Tesco's shopping trolley www.galois.com/articles/cla...

Claude Can (Sometimes) Prove It

www.galois.com

September 16, 2025 at 10:46 PM

Mike Dodds

@m-dodds.bsky.social

New Galois blog: “Specifications Don’t Exist”. If we want to formally verify more systems, we need formal specifications, but most real systems are hard to specify for very deep reasons www.galois.com/articles/spe...

Screenshot of article text: “Formal verification today is very useful, but for most systems it’s very difficult to write the kinds of complete specifications that verification needs. However, other kinds of specifications are popular: for example, a test case is a kind of limited, partial specification. The key is that a test case is immediately useful, and doesn’t impose undue costs on the development team. We need to find ways to specify systems that have these virtues, and avoid the trap of imposing a complete and coherent view that fundamentally does not exist.”

Screenshot of article text: “ I’ve come to think writing formal specifications is just a very difficult task. It requires a top-down view of the system that designers and engineers typically don’t have or, more importantly, need. In contrast, informal specifications can be ambiguous, partial, flexible. Informal specifications are intended as communication mechanisms between humans, and as a result they can be ‘wrong but useful’, and elide aspects of the system that are not of interest. This is a strength, but it also results in systems that can’t be easily formalized.”

Screenshot of article text: “ I think systems are typically both designed from the top and grown incrementally. Most systems have some degree of top-down structure, but few systems have a mathematically coherent specification that covers every behavior. The effect is that most systems obey some formal specification for some core functionality, but if outside this core, we rapidly enter muddy territory where it is unclear what the system should do, or whether the designer should even care”

Screenshot of article text: “ It’s a formal verification cliché that writing the specification tends to uncover most of the bugs in a system. To me, this suggests an analogy between specification and programming—both are tools for expressing what we want. In one way, this is a pessimistic thought: no tool can remove the burden of clarifying our ideas. But also, it gives me some hope. Programming is very difficult, but through careful tool design, we’ve made it available to hundreds of millions of people. With luck and skill, perhaps we can do the same for specifications.”

July 16, 2025 at 12:48 AM

Reposted by Mike Dodds

Hazel Weakly

@hazelweakly.me

I’m not sure how I missed this but it’s an extremely good article and you should absolutely read it. It’s about formal methods, but anyone who cares about integrating research into industry will find it valuable!

I saw a *ton* of parallels with resilience engineering too :)

Susan Potter @susanpotter.net · May 25

Nobody cares about correctness and do cheap things first are great takeaways from this but this article illustrates these and other points especially well: www.galois.com/articles/wha...

What Works (and Doesn't) Selling Formal Methods

www.galois.com

June 25, 2025 at 1:25 AM

Reposted by Mike Dodds

Galois

@galoisinc.bsky.social

At Galois, we often say things like: “Formal methods form the backbone of everything we do.”

But what exactly are formal methods? How do they work, and why are they so important?

We created a handy reference page to explain: www.galois.com/what-are-for...

A labyrinth icon, serving as a metaphor for the process of formal verification

May 27, 2025 at 5:56 PM

Mike Dodds

@m-dodds.bsky.social

New-ish @galoisinc.bsky.social blog: “What Works (and Doesn't) Selling Formal Methods”. The boring truth: engineers are rational and adoption is all about cost/benefit tradeoffs www.galois.com/articles/wha...

A graph of costs and benefits plotted against each other. There is a line under which is “Your favourite under-appreciated formal method”. There are two arrows pointing orthogonally away: “be cheaper” and “be more beneficial”

May 24, 2025 at 2:12 AM

Reposted by Mike Dodds

Galois

@galoisinc.bsky.social

What actually works when selling formal methods in industry?

What doesn't?

The way Galois Principal Scientist @m-dodds.bsky.social sees it, many FM projects don’t pencil out not because clients are irrational, but because the cost/benefit tradeoffs don’t make sense.
www.galois.com/articles/wha...

May 8, 2025 at 4:32 PM

Reposted by Mike Dodds

Galois

@galoisinc.bsky.social

c2rust is available on the Godbolt Compiler Explorer! c2rust is a tool we developed with Immunant that can convert nearly any piece of C code into compilable Rust godbolt.org/z/crsWEGEKM

Compiler Explorer - C (C2Rust (master))

/* Type your code here, or load an example. */ int square(int num) { return num * num; }

godbolt.org

April 14, 2025 at 2:52 PM

Mike Dodds

@m-dodds.bsky.social

Formal methods go great with AI www.wsj.com/articles/why...

Why Amazon is Betting on ‘Automated Reasoning’ to Reduce AI’s Hallucinations

Amazon is using math to help solve one of artificial intelligence’s most intractable problems: its tendency to make up answers, and to repeat them back to us with confidence.

www.wsj.com

February 6, 2025 at 5:21 AM

Mike Dodds

@m-dodds.bsky.social

I wrote about o3, the Frontier Math benchmark, and what it means if AI math keeps getting better

Galois @galoisinc.bsky.social · Jan 29

OpenAI recently announced their new model, o3. Most media attention focused on its impressive results on the ARC-AGI benchmark, but for us at Galois, the most significant result was the model’s 25% score on a benchmark called Frontier Math.

Learn more:
www.galois.com/articles/o3-...

January 29, 2025 at 11:24 PM

Mike Dodds

@m-dodds.bsky.social

Hot take for POPL: the PL community is still mostly in denial about AI. This is bad because PL+AI go great together

- PL can solve the hardest problem with AI - trusting the output it produces

- AI can solve the hardest problem with PL - finding enough engineers who can even use the tools

January 20, 2025 at 9:52 PM

Mike Dodds

@m-dodds.bsky.social

I’m bringing these cute Galois stickers to POPL so if you want one, come find me

January 20, 2025 at 8:09 PM

Reposted by Mike Dodds

Hillel is taking a break

@hillelwayne.com

emojikitchen.dev

Emoji Kitchen - Browse Google's unique emoji combinations

Unique illustrations of combined emoji, cooked up in Google's Emoji Kitchen, and comprehensively available on the web

emojikitchen.dev

December 27, 2024 at 12:45 AM

Mike Dodds

@m-dodds.bsky.social

Re o3 - this is the big one for me. The Frontier Math benchmark is designed to be extremely difficult, and it has a private test set (no data contamination).

Today, o3 is v expensive. But seems inevitable it’ll soon be cheap. If these results hold up, that means MUCH more powerful automated math

December 21, 2024 at 5:19 PM

Mike Dodds

@m-dodds.bsky.social

I gave a talk recently about proof technologies - what people deploy today, what might be available soon, and what seems far off even with fancy AI. Slides here: mikedodds.github.io/files/talks/...

December 17, 2024 at 3:10 AM

Mike Dodds

@m-dodds.bsky.social

I have had conversations with professor types who say “oh I don’t think an LLM will be able to solve <whatever> for a long time” and I show them the base ChatGPT model doing <whatever> first time with simple prompting. Many people’s intuitions are stuck (especially LLM critics)

Sam Tobin-Hochstadt @samth.bsky.social · Dec 15

Also I think a lot of people who are hostile to the existence of modern LLMs and hope they will go away aren't aware that you can effectively download and run the original chatgpt on your laptop now, for free.

December 15, 2024 at 5:48 PM

Reposted by Mike Dodds

Sam Tobin-Hochstadt

@samth.bsky.social

I think many of the (quite gross) reactions to this are not grappling yet with how many of their students already have what they think is this product in the form of chatgpt.

Kevin A. Bryan @afinetheorem.bsky.social · Dec 13

Super excited to publicly launch "All Day TA" (http://www.alldayta.com), a product @joshgans.bsky.social and I have been working on with our team over the last year. Short version: if you teach in spring, you will want to use this! It's the future of higher education. A short thread: 1/x

December 15, 2024 at 2:32 PM

Mike Dodds

@m-dodds.bsky.social

One of my favourite papers recently: “Verifying Cake-Cutting, Faster” arxiv.org/abs/2405.14068

Verifying Cake-Cutting, Faster

Envy-free cake-cutting protocols procedurally divide an infinitely divisible good among a set of agents so that no agent prefers another's allocation to their own. These protocols are highly complex a...

arxiv.org

December 4, 2024 at 9:58 PM

Mike Dodds

@m-dodds.bsky.social

AI personas are getting eerily accurate cc @jmct.bsky.social

December 1, 2024 at 4:41 AM

Reposted by Mike Dodds

Swarat Chaudhuri

@swarat.bsky.social

Since all my Twitter content is now gone, I will start reposting some of it here. Here are the slides for my talk on the coming wave of ML-accelerated formal methods, given at the Isaac Newton Institute last month. May interest some of you.
drive.google.com/file/d/1ybQx...

November 29, 2024 at 2:37 PM

Reposted by Mike Dodds

ionchy

@ionchy.ca

❌ all transpilers are just compilers
✅ all compilers are just transpilers

November 27, 2024 at 12:58 PM

Mike Dodds

@m-dodds.bsky.social

Typical Bluesky post: “I went out on my bike today”

Typical X post: “an AI hacked my social bonding protocol and now Claude is my only friend”

November 26, 2024 at 5:04 AM

Mike Dodds

@m-dodds.bsky.social

I’ve been reading a lot of AI / math / formal methods papers, so I made an account @mdai.bsky.social to post them

November 24, 2024 at 7:48 PM

Mike Dodds

@m-dodds.bsky.social

Pretty, pretty, pretty good

A pair of empty chairs in a theatre before a live show, with a sign above it showing Larry David and “Curb Your Enthusiasm”

November 22, 2024 at 3:35 AM

Mike Dodds

@m-dodds.bsky.social

New post: Function Argument Nullability Using an LLM

Writing a static analysis is annoying so what if you just asked an LLM instead? Turns out GPT-4o is good at analysing simple properties. Cheap to build, expensive to run, makes some mistakes. But for some applications, that’s a fine tradeoff

Function Argument Nullability Using an LLM - Galois, Inc.

by Mark Tullsen, Stuart Pernsteiner, and Mike Dodds Overview We think that Rust is a great language, and maybe you agree! Unfortunately, even if you do, there’s a good chance whatever application you’...

galois.com

November 21, 2024 at 10:38 PM
