Tobias Gerstenberg
@tobigerstenberg.bsky.social
It was great to hear from Brian Leahy (brianleahy.net) in the devo lunch at Stanford today!!
He presented a beautiful set of studies that suggest that many 4-year-old children have a minimal concept of possibility: they simulate only once and treat the outcome as a fact. 🎱⬅️➡️🤔💭💡
November 5, 2025 at 11:05 PM
We develop a causal abstraction that infers a causal story of how the data was generated, paying more attention to factors that mattered for the prediction task. This model captures participants' generalization judgments better than a feature-based model, despite having many fewer parameters.
October 24, 2025 at 7:15 PM
This time, we also asked participants to predict what would happen in novel situations. For example, we showed them two familiar cubes on a novel ramp. These generalization trials also featured ramps that were facing the opposite direction from what they had seen before.
October 24, 2025 at 7:15 PM
In Exp 3, participants viewed either forward-facing or backward-facing ramps. The cubes always ended up on the right side. Again, after participants had learned to predict which cubes cross the finish line, we surprised them by asking where exactly the cubes would end up. They made similar errors to those in Exp 2.
October 24, 2025 at 7:15 PM
Exp 2: Participants predict whether a cube on a ramp will cross a finish line. Either cube color or ramp color is diagnostic. A surprise question about exactly where the cube will end up reveals systematic errors: participants knew on which side of the line the cube would end up, but not the exact location.
October 24, 2025 at 7:15 PM
Exp 1: Participants learn whether color or shape matters for turning on a machine. In a surprise test, we ask them what they saw last. They frequently misremember (e.g., choosing a differently colored object when only the shape mattered). This only happens with enough evidence to learn the rule!
October 24, 2025 at 7:15 PM
When people are asked to predict what happens next, do they learn simple feature-outcome mappings, or do they learn causal models that capture the underlying generative process? If they do learn causal models, how can we tell? We ran 3 experiments (N=1080) using two paradigms to find out.
October 24, 2025 at 7:15 PM
🚨 New preprint 🚨
How do people's mental models shape memory, prediction, and generalization? We find that people spontaneously construct goal-dependent causal abstractions that compress experience to privilege relevant information.
📃 osf.io/preprints/ps...
🔗 github.com/cicl-stanfor...
October 24, 2025 at 7:15 PM
The Causality in Cognition Lab at Stanford University is recruiting PhD students this cycle!
We are a supportive team who happened to wear Bluesky-appropriate colors for the lab photo (this wasn't planned). 💙
Lab info: cicl.stanford.edu
Application details: psychology.stanford.edu/admissions/p...
October 17, 2025 at 5:43 PM
This project was expertly led by David Rose (davdrose.github.io) in collaboration with @siyingzhg.bsky.social, Sophie Bridgers, Hyo Gweon, and myself.
📄 doi.org/10.31234/osf...
🔗 https://github.com/cicl-stanford/counterfactual_development
October 13, 2025 at 7:58 PM
So what did we find? We tested 480 children and 91 adults online. Participants saw 4 (Exp 1) or 6 different scenarios. We find that children perform above chance when they're around 5 years of age. And we find a marked shift in performance around 7 years of age (where most children seem to get it).
October 13, 2025 at 7:58 PM
Three experiments rule out simpler explanations:
1️⃣ Different objects; children might answer based on preference.
2️⃣ Same objects; children might anticipate what would happen (hypothetical thinking).
3️⃣ Same objects, outcome revealed later; children need genuine counterfactual thinking.
October 13, 2025 at 7:58 PM
The "dropping things" task removes language and tests genuine counterfactual thinking.
Granny drops two objects: an 🥚 and a 🏀. Two friends catch them. Granny would like to thank them but only has one sticker. Who should she give it to? Not catching the 🥚 would have been worse, so "Suzy"!
October 13, 2025 at 7:58 PM
Estimates of when counterfactual thinking develops range from 2 to 12 years. Two potential reasons: language & reasoning.
💬 A question like: "Where would Peter have been if there hadn’t been a fire?” is difficult to understand!
🤔 Counterfactual and hypothetical thinking are different!
October 13, 2025 at 7:58 PM
🚨New Preprint: We develop a novel task that probes counterfactual thinking without using counterfactual language, and that teases apart genuine counterfactual thinking from related forms of thinking. Using this task, we find that the ability for counterfactual thinking emerges around 5 years of age.
October 13, 2025 at 7:58 PM
Thanks Evan Orticio (orticio.com) for sharing your fascinating work with us on how children and adults form beliefs without direct evidence.
In one super cool study, he shows how children become more diligent fact checkers in less reliable environments.
📃 orticio.com/assets/Ortic...
October 10, 2025 at 10:14 PM
I had a wonderful time visiting UC Irvine to give a talk in the cognitive science colloquium. Thank you @annaleshinskaya.bsky.social for being a fantastic host, and to all the other faculty, students, and postdocs I got to meet during my visit 🙏
October 2, 2025 at 8:12 PM
Simulating world models supports strong multimodal inferences. Prior work modeled multimodal inference as optimal averaging. But in the "Sound + Ball Occluded" condition each modality alone is useless (only hearing sounds, or only seeing obstacles). Combining both sources reveals what happened!
September 16, 2025 at 7:04 PM
The Sequential Sampler accurately captures people's judgments and eye-movements across the three inference conditions.
September 16, 2025 at 7:04 PM
The model also predicts eye movements. It assumes that people look at visual features of the scene, but also at dynamic features that are the consequence of mentally simulating how the ball would fall and collide with the obstacles and walls if it were dropped into the different holes.
September 16, 2025 at 7:04 PM
In the prediction task, participants click 10 times where they think the ball will land, allowing them to express their uncertainty in a structured way. Their predictions are very well explained (r = 0.99) by a physics simulation model that assumes that people are unsure about how the ball drops and how it collides.
September 16, 2025 at 7:04 PM
We created "Plinko" - a physics reasoning task where people:
🔮 PREDICT where a ball will land (forward reasoning)
🕵️ INFER where a ball came from using visual + auditory cues (backward reasoning)
September 16, 2025 at 7:04 PM
🚨 NEW PREPRINT: Multimodal inference through mental simulation.
We examine how people figure out what happened by combining visual and auditory evidence through mental simulation.
Paper: osf.io/preprints/ps...
Code: github.com/cicl-stanfor...
September 16, 2025 at 7:04 PM