Max Kleiman-Weiner
@maxkw.bsky.social
professor at university of washington and founder at csm.ai. computational cognitive scientist. working on social and artificial intelligence and alignment.
http://faculty.washington.edu/maxkw/
New paper challenges how we think about Theory of Mind. What if we model others as executing simple behavioral scripts rather than reasoning about complex mental states? Our algorithm, ROTE (Representing Others' Trajectories as Executables), treats behavior prediction as program synthesis.
October 3, 2025 at 5:01 AM
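To make the idea concrete, here is a minimal sketch of behavior-prediction-as-program-synthesis (illustrative only; the candidate scripts and scoring rule are assumptions, not the actual ROTE implementation):

```python
# Hypothetical sketch: score simple behavioral scripts against an observed
# trajectory and predict the next action with the best-fitting script.

def always(action):
    return lambda history: action

def repeat_last(history):
    return history[-1] if history else "A"

def alternate(history):
    return "B" if (history and history[-1] == "A") else "A"

CANDIDATE_SCRIPTS = {
    "always_A": always("A"),
    "always_B": always("B"),
    "repeat_last": repeat_last,
    "alternate": alternate,
}

def fit(script, trajectory):
    # Fraction of steps where the script reproduces the observed action.
    hits = sum(script(trajectory[:t]) == a for t, a in enumerate(trajectory))
    return hits / len(trajectory)

def predict_next(trajectory):
    best = max(CANDIDATE_SCRIPTS.values(), key=lambda s: fit(s, trajectory))
    return best(trajectory)

print(predict_next(["A", "B", "A", "B"]))  # the "alternate" script wins -> "A"
```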
Excited by our new work estimating the empowerment of LLM-based agents in text and code. Empowerment is the causal influence an agent has over its environment and measures an agent's capabilities without requiring knowledge of its goals or intentions.
October 1, 2025 at 4:27 AM
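One way to make this concrete: in a deterministic environment, n-step empowerment reduces to the log of the number of states the agent can reach in n steps. A toy sketch (illustrative only; the paper's setting is LLM agents in text and code, not a grid):

```python
# Toy illustration of empowerment: with deterministic dynamics, n-step
# empowerment equals log2 of the number of reachable future states.
from math import log2
from itertools import product

def step(state, action):
    # A tiny deterministic 1-D world: move left, right, or stay, clipped to [0, 4].
    dx = {"left": -1, "stay": 0, "right": 1}[action]
    return min(4, max(0, state + dx))

def empowerment(state, horizon):
    reachable = set()
    for plan in product(["left", "stay", "right"], repeat=horizon):
        s = state
        for a in plan:
            s = step(s, a)
        reachable.add(s)
    return log2(len(reachable))

print(empowerment(2, 1))  # 3 reachable states -> ~1.58 bits
print(empowerment(0, 1))  # at the wall, only 2 reachable states -> 1 bit
```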
Finally, when we tested it against memory-1 strategies (such as TFT and WSLS) in the iterated prisoner's dilemma, the Bayesian Reciprocator expanded the range where cooperation is possible and dominated prior algorithms, using the *same* model across simultaneous & sequential games.
July 22, 2025 at 6:04 AM
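For reference, memory-1 strategies condition only on the previous round's pair of moves. Here are the standard definitions of the two baselines named above (illustrative code, not the paper's):

```python
# Memory-1 strategies for the iterated prisoner's dilemma.
# C = cooperate, D = defect. Standard textbook definitions.

def tit_for_tat(my_last, their_last):
    # TFT: cooperate first, then copy the opponent's previous move.
    return their_last if their_last is not None else "C"

def win_stay_lose_shift(my_last, their_last):
    # WSLS (Pavlov): repeat your move after a "win" (opponent cooperated),
    # switch after a "loss" (opponent defected).
    if their_last is None:
        return "C"
    if their_last == "C":
        return my_last                      # win -> stay
    return "D" if my_last == "C" else "C"   # lose -> shift

print(tit_for_tat("C", "D"))          # opponent defected last round -> "D"
print(win_stay_lose_shift("C", "D"))  # cooperated and got exploited -> "D"
```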
Even in one-shot games with observability, the Bayesian Reciprocator learns from observing others' interactions and enables cooperation through indirect reciprocity.
July 22, 2025 at 6:04 AM
In dyadic repeated interactions in the Game Generator, the Bayesian Reciprocator quickly learns to distinguish cooperators from cheaters, remains robust to errors, and achieves high population payoffs through sustained cooperation.
July 22, 2025 at 6:04 AM
Instead of just testing on the repeated prisoner's dilemma, we created a "Game Generator" that produces an endless stream of cooperation challenges where no two interactions are alike. Many classic games, like the prisoner's dilemma or resource allocation games, are just special cases.
July 22, 2025 at 6:04 AM
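A purely illustrative stand-in for the idea (the actual generator covers a much richer space of games): sample a fresh random payoff matrix for every interaction, so the prisoner's dilemma is just one point in the space.

```python
# Illustrative "Game Generator": sample random 2x2 payoff matrices so no two
# interactions are alike. The paper's generator is far richer than this.
import random

def sample_game(rng):
    # payoffs[(my_move, their_move)] = (my_payoff, their_payoff)
    moves = ["C", "D"]
    return {
        (a, b): (rng.uniform(0, 10), rng.uniform(0, 10))
        for a in moves for b in moves
    }

rng = random.Random(0)
game = sample_game(rng)
print(game[("C", "C")])  # a fresh, never-before-seen payoff pair
```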
It uses theory of mind to infer the latent utility functions of others via Bayesian inference, and an abstract utility calculus to work across ANY game.
July 22, 2025 at 6:04 AM
We introduce the "Bayesian Reciprocator," an agent that cooperates with others proportional to its belief that others share its utility function.
July 22, 2025 at 6:04 AM
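In sketch form, the core loop might look like this (a hedged illustration of the stated idea; the likelihood values are placeholder assumptions, not the paper's model):

```python
# Sketch: maintain a belief that the partner shares your utility function,
# update it by Bayes' rule after each observed action, and cooperate in
# proportion to that belief. Illustrative only.

def bayes_update(belief, action, p_coop_if_shared=0.9, p_coop_if_not=0.2):
    like_shared = p_coop_if_shared if action == "C" else 1 - p_coop_if_shared
    like_other = p_coop_if_not if action == "C" else 1 - p_coop_if_not
    post = like_shared * belief
    return post / (post + like_other * (1 - belief))

belief = 0.5  # prior that the partner shares my utility function
for observed in ["C", "C", "D", "C"]:
    belief = bayes_update(belief, observed)
    p_cooperate = belief  # cooperate proportional to the belief
    print(f"saw {observed}: belief={belief:.2f}, P(cooperate)={p_cooperate:.2f}")
```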
Settling in for my flight and apparently A.I. DOOM is now a movie genre between Harry Potter and Classics. Nothing better than an existential crisis with pretzels and a ginger ale.
June 29, 2025 at 10:52 PM
LLMs learn beliefs and values from human data, influence our opinions, and then reabsorb those influenced beliefs, feeding them back to users again and again. We call this the "Lock-In Hypothesis" and develop theory, simulations, and empirics to test it in our latest ICML paper!
June 9, 2025 at 8:23 PM
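A toy version of the feedback loop (illustrative only, not the paper's model): user beliefs drift toward the model's output, the model is re-fit on the drifted beliefs, and belief diversity collapses round after round.

```python
# Toy "Lock-In" simulation: the model's belief influences users, then the
# model reabsorbs the influenced beliefs, shrinking diversity each round.
import statistics

users = [i / 10 for i in range(11)]   # user beliefs spread over [0, 1]
model = statistics.mean(users)        # model fit to human data

for round_ in range(5):
    # Influence: each user's belief moves 30% of the way toward the model's.
    users = [u + 0.3 * (model - u) for u in users]
    # Reabsorption: the model is re-fit on the influenced beliefs.
    model = statistics.mean(users)
    spread = statistics.pstdev(users)
    print(f"round {round_}: model={model:.3f}, belief spread={spread:.3f}")
```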
Emergent transition from code to natural language for reasoning tasks when RL tuning a language model for math. Interesting to consider implications for "Language of Thought" style theories in cognition.

hkust-nlp.notion.site/simplerl-rea...
January 26, 2025 at 6:32 AM
Josh Tenenbaum on scaling up vs growing up and the path to human-like reasoning #NeurIPS2024
December 15, 2024 at 6:14 PM
So much fun working with wonderful co-first authors:
Zhijing Jin and Giorgio Piatti, and collaborators Sydney Levine, Jiarui Liu, Fernando Gonzalez, Francesco Ortu, András Strausz, @mrinmaya.bsky.social, Rada Mihalcea, Yejin Choi, and Bernhard Schölkopf
December 14, 2024 at 5:40 PM
Only at NeurIPS
December 10, 2024 at 7:27 PM
We benchmark 13 LLMs in GovSim. Overall, we find that this is quite challenging for LLMs! Even the best-performing model (GPT-4o) survives the first 12 rounds less than 60% of the time. Most never survive.
December 5, 2024 at 5:03 PM
Over the past 40 years, this question has received extensive empirical attention, culminating in the 2009 Nobel Prize to Elinor Ostrom for characterizing the many mechanisms people and communities use to cooperate sustainably from the bottom up.
December 5, 2024 at 5:03 PM
Excited that our multi-agent LLM work, “Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents,” will be presented at #NeurIPS24 -- reach out if you want to meet up in Vancouver!
December 5, 2024 at 5:03 PM
GitHub Copilot output. A sad but fascinating alignment failure. It reveals hidden LLM biases by pushing them outside their RLHF distribution.
November 26, 2024 at 4:12 AM
I’m recruiting PhD students to join the Computational Minds and Machines Lab at the University of Washington in Seattle! Join us to work at the intersection of computational cognitive science and AI with a broad focus on social intelligence. (Please reshare!)
November 15, 2024 at 5:01 PM
Would love to see these examples scaled up into a larger benchmark about models of self.
November 11, 2024 at 3:54 PM
These videos of a humanoid watching itself in a mirror are a cool test of a “self” model. The robot’s world model tries to predict the next visual inputs. It can predict its movements from a first-person view but doesn’t map the mirrored body to its own, and thus, the mirrored view doesn't match.
November 11, 2024 at 3:53 PM
Hope this is a step towards understanding computation in the brain.
October 14, 2024 at 5:17 PM
Incredible to see that the fruit fly connectome has been completed. I remember as an undergraduate watching Sebastian Seung give the 2007 presidential lecture at SFN, mysteriously titled “The Once and Future Science of Neural Networks.”
October 14, 2024 at 5:15 PM
The full scientific work is published across 8 Nature papers in a single issue: https://www.nature.com/collections/hgcfafejia
November 16, 2024 at 9:12 AM