Austin van Loon
austin-van-loon.bsky.social
Assistant professor of work and organization studies at MIT Sloan School of Management. I’m a sociologist and computational social scientist studying culture, identity, intergroup conflict, and other things I find interesting or important.
“…the mixed [subjects] design… is the most reasonable approach I’ve seen in the LLMs for social science literature for integrating LLM simulations into confirmatory-style experiments…”

Also provides thoughtful reflections on the limitations of the approach!
June 13, 2025 at 10:20 AM
Thank you @davidbroska.bsky.social and Michael for your leadership on this paper. It’s been a joy working with you both. Onwards to more breakthroughs in computational social science!
February 18, 2025 at 11:55 AM
💡 LLMs aren’t going anywhere. As social scientists, we can either ignore them or work to integrate them rigorously into the research process. We hope this is a step in the right direction—leveraging what LLMs can tell us about human behavior while preserving scientific rigor.
February 18, 2025 at 11:55 AM
But wait, there’s more! Since many social scientists are new to PPI, we built tools—like a PPI power-analysis tool. Given an estimated treatment effect and assumed interchangeability, it optimizes the budget allocation between expensive human data and possibly biased LLM responses (!!)
February 18, 2025 at 11:55 AM
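Their power-analysis tool isn’t reproduced here, but the budget-splitting idea can be sketched as a standard variance minimization under a cost constraint. Everything below—the function name, parameters, and numbers—is my own illustrative assumption, not the paper’s actual interface:

```python
import numpy as np

def ppi_allocation(budget, cost_human, cost_llm, var_llm, var_resid):
    """Split a budget between paired human+LLM samples (n) and
    LLM-only samples (N) to minimize the PPI-style variance

        Var ~ var_llm / N + var_resid / n

    subject to  n * cost_human + N * cost_llm = budget.
    Closed form from the Lagrange conditions:
        n ~ sqrt(var_resid / cost_human),  N ~ sqrt(var_llm / cost_llm).
    """
    w_n = np.sqrt(var_resid / cost_human)  # weight for paired human+LLM samples
    w_N = np.sqrt(var_llm / cost_llm)      # weight for cheap LLM-only samples
    scale = budget / (w_n * cost_human + w_N * cost_llm)
    return w_n * scale, w_N * scale

# Toy scenario: a human response costs 100x an LLM call, and the
# bias-correction residuals are as variable as the LLM responses.
n_paired, n_llm_only = ppi_allocation(
    budget=1000.0, cost_human=10.0, cost_llm=0.1,
    var_llm=1.0, var_resid=1.0,
)
```

The intuition: the cheaper and less noisy a data source is, the more of the budget it gets—so as LLM calls get cheaper relative to human subjects, the optimal design shifts toward many silicon responses anchored by a small paired human sample.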
3️⃣ The more informative LLM behavior is about human behavior, the more we learn from the silicon data—and the more precise the estimate becomes. As models and prompting improve in the future, the estimate ✨automagically✨ adjusts!
February 18, 2025 at 11:55 AM
Key properties of this estimator:

1️⃣ It adjusts for silicon bias—if your human-only estimate is unbiased, so is this estimate.

2️⃣ If LLM data adds no info, the mixed estimate is as precise as the human-only one.
February 18, 2025 at 11:55 AM
How does it work? We collect a small set of LLM responses that “match” gold-standard human data (think silicon twins). Leveraging prediction-powered inference, we use these to estimate and correct for biases in the LLM, then combine with more LLM data. More details in the paper!
February 18, 2025 at 11:55 AM
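The “estimate the bias from silicon twins, then correct” step can be sketched with a toy mean estimate. All names and numbers below are made up for illustration; the paper’s actual estimator handles general treatment effects, not just a mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n paired observations where a human and a matched
# LLM "silicon twin" both respond, plus N cheap LLM-only responses.
n, N = 100, 2000
y_human = rng.normal(1.0, 1.0, size=n)             # gold-standard human outcomes (true mean = 1.0)
f_paired = y_human + 0.5 + rng.normal(0, 0.5, n)   # matched LLM responses, biased upward by 0.5
f_only = 1.5 + rng.normal(0, 1.1, size=N)          # LLM-only responses with the same +0.5 bias

# Prediction-powered-style estimate of the human mean:
# the silicon mean, minus the silicon bias estimated from the paired twins.
bias_hat = (f_paired - y_human).mean()
theta_pp = f_only.mean() - bias_hat

# Classical human-only estimate, for comparison.
theta_human = y_human.mean()
```

Despite the LLM’s built-in +0.5 bias, the corrected estimate centers on the true human mean—the bias-adjustment property from the “key properties” post, in miniature.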
What if you didn’t need the interchangeability assumption? My coauthors (David Broska and Michael Howes) and I propose a “mixed-subjects” approach: collecting human and silicon data together to produce estimates about humans that are unbiased and more precise. 🙂➕🤖
February 18, 2025 at 11:55 AM
However, silicon sampling today often relies on an untested interchangeability assumption—that LLMs act like humans (at least on average). But we now have a TON of evidence that LLMs misrepresent human behavior and opinions in all kinds of ways. There’s good reason to be skeptical. 🤔
February 18, 2025 at 11:55 AM
If effective, silicon sampling could make social science more efficient, safe, and equitable. Clearly LLMs contain a lot of information about us—it would be foolish (and irresponsible) to just ignore it.
February 18, 2025 at 11:55 AM
Since the generative AI boom, researchers have flirted with “silicon sampling”—using LLMs to simulate human subjects. For instance, ask an LLM how it’d answer survey questions after experiencing an experimental stimulus, then treat its responses as data.
February 18, 2025 at 11:55 AM