Max Kleiman-Weiner
maxkw.bsky.social
professor at university of washington and founder at csm.ai. computational cognitive scientist. working on social and artificial intelligence and alignment.
http://faculty.washington.edu/maxkw/
Reposted by Max Kleiman-Weiner
Forget modeling every belief and goal! What if we represented people as following simple scripts instead (e.g., "cross the crosswalk")?

Our new paper shows AI which models others’ minds as Python code 💻 can quickly and accurately predict human behavior!

shorturl.at/siUYI
October 3, 2025 at 2:24 AM
New paper challenges how we think about Theory of Mind. What if we model others as executing simple behavioral scripts rather than reasoning about complex mental states? Our algorithm, ROTE (Representing Others' Trajectories as Executables), treats behavior prediction as program synthesis.
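
A toy sketch of the idea (my own illustration; the scripts, names, and scoring here are hypothetical, not the paper's implementation): represent candidate "scripts" as small executable Python programs, roll them out, and keep whichever best reproduces the observed trajectory.

```python
# Toy illustration of behavior-prediction-as-program-synthesis.
# Scripts, names, and scoring are hypothetical, not from the ROTE paper.

def cross_crosswalk(pos):
    """Script: walk straight toward the far curb."""
    x, y = pos
    return (x + 1, y)

def jaywalk(pos):
    """Script: cut diagonally across the street."""
    x, y = pos
    return (x + 1, y + 1)

def rollout(script, start, steps):
    """Execute a script for a fixed number of steps."""
    traj, pos = [start], start
    for _ in range(steps):
        pos = script(pos)
        traj.append(pos)
    return traj

def score(script, observed):
    """Count how many observed states the script reproduces."""
    predicted = rollout(script, observed[0], len(observed) - 1)
    return sum(p == o for p, o in zip(predicted, observed))

observed = [(0, 0), (1, 0), (2, 0)]        # pedestrian walking straight
best = max([cross_crosswalk, jaywalk], key=lambda s: score(s, observed))
print(best.__name__)                        # -> cross_crosswalk
print(rollout(best, (2, 0), 2))             # predict the next two steps
```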
October 3, 2025 at 5:01 AM
When values collide, what do LLMs choose? In our new paper, "Generative Value Conflicts Reveal LLM Priorities," we generate scenarios where values are traded off against each other. We find models prioritize "protective" values in multiple-choice, but shift toward "personal" values when interacting.
🚨New Paper: LLM developers aim to align models with values like helpfulness or harmlessness. But when these conflict, which values do models choose to support? We introduce ConflictScope, a fully-automated evaluation pipeline that reveals how models rank values under conflict.
(📷 xkcd)
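
A hypothetical sketch of the two elicitation modes (the scenario, value labels, and query_model stub are all my own illustration, not the ConflictScope pipeline): force a discrete choice between the conflicting values, then elicit a free-form reply and judge which value it serves.

```python
# Hypothetical sketch of a value-conflict probe in the spirit of
# ConflictScope; all details here are my own illustration.

SCENARIO = (
    "A user asks you to review a friend's application letter, "
    "which contains a factual exaggeration that helps their case."
)
VALUES = {"A": "honesty (protective)", "B": "loyalty to the user (personal)"}

def query_model(prompt: str) -> str:
    """Stand-in for a real LLM call; swap in an actual API client."""
    return "A"  # placeholder response

def multiple_choice_probe(scenario: str) -> str:
    """Force a discrete choice between the two conflicting values."""
    prompt = (
        f"{scenario}\nWhich value should guide your reply?\n"
        + "\n".join(f"{k}: {v}" for k, v in VALUES.items())
        + "\nAnswer with A or B."
    )
    return query_model(prompt)

def open_ended_probe(scenario: str) -> str:
    """Elicit a free-form reply, to be judged for which value it serves."""
    return query_model(f"{scenario}\nWrite your reply to the user.")

print(multiple_choice_probe(SCENARIO))  # discrete value ranking
print(open_ended_probe(SCENARIO))       # judged for value support in practice
```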
October 2, 2025 at 6:37 PM
Excited by our new work estimating the empowerment of LLM-based agents in text and code. Empowerment is the causal influence an agent has over its environment and measures an agent's capabilities without requiring knowledge of its goals or intentions.
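
For reference, the classic information-theoretic definition (Klyubin, Polani & Nehaniv, 2005) identifies empowerment with the channel capacity from an agent's next n actions to the resulting state; whether this paper's estimator takes exactly this form is my assumption.

```latex
% n-step empowerment of state s: the channel capacity from the agent's
% next n actions A_{1:n} to the state S_{t+n} reached n steps later.
\mathfrak{E}_n(s) = \max_{p(a_{1:n})} I\!\left(A_{1:n};\, S_{t+n} \,\middle|\, S_t = s\right)
```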
October 1, 2025 at 4:27 AM
Claire's new work showing that when an assistant aims to optimize another's empowerment, it can lead to others being disempowered (both as a side effect and as an intentional outcome)!
Still catching up on my notes after my first #cogsci2025, but I'm so grateful for all the conversations and new friends and connections! I presented my poster "When Empowerment Disempowers" -- if we didn't get the chance to chat or you would like to chat more, please reach out!
August 6, 2025 at 10:44 PM
Reposted by Max Kleiman-Weiner
Still catching up on my notes after my first #cogsci2025, but I'm so grateful for all the conversations and new friends and connections! I presented my poster "When Empowerment Disempowers" -- if we didn't get the chance to chat or you would like to chat more, please reach out!
August 6, 2025 at 10:31 PM
Reposted by Max Kleiman-Weiner
lol this may be the most cogsci cogsci slide I've ever seen, from @maxkw.bsky.social

"before I got married I had six theories about raising children, now I have six kids and no theories"......but here's another theory #cogsci2025
July 31, 2025 at 6:18 PM
Our new paper is out in PNAS: "Evolving general cooperation with a Bayesian theory of mind"!

Humans are the ultimate cooperators. We coordinate on a scale and scope no other species (nor AI) can match. What makes this possible? 🧵

www.pnas.org/doi/10.1073/...
Evolving general cooperation with a Bayesian theory of mind | PNAS
Theories of the evolution of cooperation through reciprocity explain how unrelated self-interested individuals can accomplish more together than th...
www.pnas.org
July 22, 2025 at 6:04 AM
Reposted by Max Kleiman-Weiner
As always, CogSci has a fantastic lineup of workshops this year. An embarrassment of riches!

Still deciding which to pick? If you are interested in building computational models of social cognition, I hope you consider joining @maxkw.bsky.social, @dae.bsky.social, and me for a crash course on memo!
#Workshop at #CogSci2025
Building computational models of social cognition in memo

🗓️ Wednesday, July 30
📍 Pacifica I - 8:30-10:00
🗣️ Kartik Chandra, Sean Dae Houlihan, and Max Kleiman-Weiner
🧑‍💻 underline.io/events/489/s...
July 18, 2025 at 1:56 PM
Very excited for this workshop!
#Workshop at #CogSci2025
Building computational models of social cognition in memo

🗓️ Wednesday, July 30
📍 Pacifica I - 8:30-10:00
🗣️ Kartik Chandra, Sean Dae Houlihan, and Max Kleiman-Weiner
🧑‍💻 underline.io/events/489/s...
July 17, 2025 at 4:42 AM
Reposted by Max Kleiman-Weiner
#Workshop at #CogSci2025
Building computational models of social cognition in memo

🗓️ Wednesday, July 30
📍 Pacifica I - 8:30-10:00
🗣️ Kartik Chandra, Sean Dae Houlihan, and Max Kleiman-Weiner
🧑‍💻 underline.io/events/489/s...
July 16, 2025 at 8:32 PM
Reposted by Max Kleiman-Weiner
'Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination'

@kjha02.bsky.social · Wilka Carvalho · Yancheng Liang · Simon Du ·
@maxkw.bsky.social · @natashajaques.bsky.social

doi.org/10.48550/arX...

(3/20)
July 15, 2025 at 1:44 PM
Settling in for my flight and apparently A.I. DOOM is now a movie genre between Harry Potter and Classics. Nothing better than an existential crisis with pretzels and a ginger ale.
June 29, 2025 at 10:52 PM
Reposted by Max Kleiman-Weiner
Thanks to the Diverse Intelligence Community for all these inspiring days & impressions in Sydney 🙏🏻 @chriskrupenye.bsky.social @katelaskowski.bsky.social @divintelligence.bsky.social @maxkw.bsky.social
June 28, 2025 at 3:46 AM
LLMs learn beliefs and values from human data, influence our opinions, and then reabsorb those influenced beliefs, feeding them back to users again and again. We call this the "Lock-In Hypothesis" and develop theory, simulations, and empirics to test it in our latest ICML paper!
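
A minimal toy simulation of that feedback loop (my own illustration; the paper's models are far richer): a "model" is fit to the population's beliefs, the population partially adopts the model's output, and the model is refit. The mean stays put while diversity collapses, so early beliefs get locked in.

```python
# Toy lock-in feedback loop: fit -> influence -> refit.
import random

random.seed(0)
beliefs = [random.gauss(0, 1) for _ in range(1000)]  # diverse human opinions

for generation in range(10):
    model = sum(beliefs) / len(beliefs)          # "LLM" fit to current data
    beliefs = [0.7 * b + 0.3 * model for b in beliefs]  # humans adopt output
    spread = (sum((b - model) ** 2 for b in beliefs) / len(beliefs)) ** 0.5
    print(f"gen {generation}: mean={model:+.3f} spread={spread:.3f}")
# Output: the mean barely moves while the spread shrinks geometrically.
```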
June 9, 2025 at 8:23 PM
Excited to speak about some new work on Bayesian cooperation at this workshop! Join us virtually.
I am really happy to share information about the 2nd Workshop on Modeling and Applications of Evolutionary Game Theory, which will be held virtually on Thursday, May 8, and Friday, May 9.

sites.google.com/view/2nd-evo...

I enjoyed organizing this workshop with Olivia Chu and Alex McAvoy.
Workshop on Modeling and Applications of Evolutionary Game Theory
Dates: May 8-9, 2025
Location: Held virtually via Zoom
sites.google.com
April 28, 2025 at 9:14 PM
Reposted by Max Kleiman-Weiner
Now out in JPSP ‼️

"Inference from social evaluation" with Zach Davis, Kelsey Allen, @maxkw.bsky.social, and @julianje.bsky.social

📃 (paper): psycnet.apa.org/record/2026-...
📜 (preprint): osf.io/preprints/ps...
April 25, 2025 at 3:55 PM
Reposted by Max Kleiman-Weiner
Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity.

Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks.

shorturl.at/fqsNN
April 19, 2025 at 12:06 AM
Awesome new work from my lab led by @kjha02.bsky.social scaling cooperative AI! True cooperation requires adapting to both unfamiliar partners and novel environments. Training with CEC gets us closer to agents that act on general cooperative principles rather than memorized strategies.
Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity.

Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks.

shorturl.at/fqsNN
April 19, 2025 at 6:24 AM
How AlphaGo-like architectures can explain human insight. Out now in Cognition!
March 14, 2025 at 3:26 PM
Reposted by Max Kleiman-Weiner
my paper with max, @maxkw.bsky.social, tuomas, and @fierycushman.bsky.social out in cognition at long last www.sciencedirect.com/science/arti...

We explain why humans and successful AI planners both fail on a certain kind of problem that we might describe as requiring insight or creativity
Similar failures of consideration arise in human and machine planning
Humans are remarkably efficient at decision making, even in “open-ended” problems where the set of possible actions is too large for exhaustive evalua…
www.sciencedirect.com
March 14, 2025 at 3:21 PM
Accepted as a Spotlight at ICLR 2025!
Honored to receive the Best Paper Award at the #NeurIPS2024 Pluralistic Alignment Workshop

Check out our preprint "Language Model Alignment in Multilingual Trolley Problems" at arxiv.org/pdf/2407.02273!
arxiv.org
February 13, 2025 at 11:43 PM
An emergent transition from code to natural language on reasoning tasks when RL-tuning a language model for math. Interesting to consider the implications for "Language of Thought" style theories in cognition.

hkust-nlp.notion.site/simplerl-rea...
January 26, 2025 at 6:32 AM
Reposted by Max Kleiman-Weiner
🔊 New paper just accepted in JPSP 🥳

In "Inference from social evaluation", we explore how people use social evaluations, such as judgments of blame or praise, to figure out what happened.

📜 osf.io/preprints/ps...

📎 github.com/cicl-stanfor...

1/6
January 23, 2025 at 4:53 PM
Very nice to see our work on LLM agent cooperation covered in Wired! www.wired.com/story/ai-soc...
January 9, 2025 at 6:17 PM