Dan Butler
@danieljbutler.bsky.social
interests: software, neuroscience, causality, philosophy | ex: salk institute, u of washington, MIT | djbutler.github.io
It would be fun for a benchmark to focus on problems that are more "visual" - truths that are easy for humans to "see" but hard for them to prove formally
July 23, 2025 at 11:36 AM
Isn't natural language still awfully close to a formal / symbolic domain? Human mathematical intuition seems grounded in spatiotemporal relationships, not natural language.
July 22, 2025 at 1:20 PM
It could be called "turbulence"
July 17, 2025 at 11:09 AM
Length of chain of thought does indeed correlate with difficulty - see attached
June 29, 2025 at 9:14 AM
I’m genuinely confused by these statements. Chain of thought length absolutely does correlate with difficulty - generally the LLM will stop thinking when it has reached a reasonable answer. Likewise in human reasoning!
June 29, 2025 at 8:32 AM
The number of tokens doesn't necessarily stay the same, does it? LLMs can execute algorithms and output the stored values at intermediate steps as tokens, so the number of tokens / amount of computation scales up with the difficulty of the problem (size of the input, in the case of factorization)
June 28, 2025 at 11:09 AM
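A rough sketch of that point (an editor's illustration, not part of the original post): if a model "showed its work" by emitting each trial-division step as tokens, the length of the emitted trace would grow with the size of the number being factored rather than staying constant.

```python
# Hypothetical illustration (not from the original post): count the
# intermediate steps of trial-division factorization. If each step were
# emitted as chain-of-thought tokens, the trace length would scale with
# the size of the input instead of staying fixed.

def trial_division(n: int) -> tuple[list[int], int]:
    """Factor n by trial division; return (factors, number of divisor checks)."""
    factors, checks, d = [], 0, 2
    while d * d <= n:
        checks += 1              # one "step" per candidate divisor tried
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)        # leftover factor is prime
    return factors, checks

for n in (91, 10_007, 1_000_003, 999_999_937):
    factors, checks = trial_division(n)
    print(f"n={n}: factors={factors}, divisor checks={checks}")
```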
But isn’t it just a constant amount of compute per token? Producing more tokens involves using more time and space. Chain of thought, etc.
June 28, 2025 at 10:47 AM
By contrast, good explanatory scientific theories generalize to a broader set of "perturbations" than just the types of experiments that went into constructing the theory. Watson and Crick's model of DNA was not just a way to predict x-ray diffraction patterns.
June 25, 2025 at 11:35 PM
Totally right, you said something different. You're much more pro this type of model learned from perturbation data.

My concern is that you end up with a causal model, yes - but the perturbations are drawn from a very constrained distribution. The ML model can more or less memorize them.
June 25, 2025 at 11:20 PM
Also notable that this type of work doesn't use any of the conditional independence assumptions that are common in the causal modeling community @alxndrmlk.bsky.social
June 24, 2025 at 5:27 PM
@kordinglab.bsky.social argued in a recent talk that you can't learn a model from canned data that will let you simulate perturbation experiments.

bsky.app/profile/kemp...

But this type of model seems darn close.
June 24, 2025 at 5:25 PM
Philip did mention a MW talk from Zurek I think
June 14, 2025 at 1:55 PM
Do we know if the number of steps they can perform is related to how many steps they saw in their training data? Can RL fine-tuning increase the number of steps?
June 10, 2025 at 11:52 AM
Does anyone know what species this is? Would love to know more about what structures play the role of nervous system and muscles
June 8, 2025 at 10:04 PM
Who knew that Chargaff was into this stuff
June 8, 2025 at 2:03 PM
Things that aren’t chock-full of information-bearing molecules
June 7, 2025 at 5:31 PM
Because the kinds of theories we want involve phenomena that span 3-4 orders of magnitude in space (synapses vs. brains) and 6-7 orders of magnitude in time (action potentials vs. skill acquisition)?
June 7, 2025 at 2:40 PM
There’s a good definition of computational universality (Church-Turing) - why couldn’t there be one of general intelligence?
May 30, 2025 at 1:14 PM
If constructor theory told us something amazing *was* constructible, it might help motivate us to build it.

Conversely, we could avoid wasting our time on things not even constructible in principle.
May 25, 2025 at 4:17 PM
Quiet posters feed. You’re welcome.
May 24, 2025 at 8:13 PM