Jamie Cummins
@jamiecummins.bsky.social
Currently a visiting researcher at Uni of Oxford. Normally at Uni of Bern.
Meta-scientist building tools to help other scientists. NLP, simulation, & LLMs.
Creator and developer of RegCheck (https://regcheck.app).
1/4 of @error.reviews.
🇮🇪
Pinned
Jamie Cummins
@jamiecummins.bsky.social
· Jul 23
RegCheck.app
RegCheck is an AI tool to compare preregistrations with papers instantly.
regcheck.app
Introducing RegCheck: a tool which uses Large Language Models to automatically compare preregistered protocols with their corresponding published papers and highlights deviations.
@malte.the100.ci @ianhussey.bsky.social @ruben.the100.ci @bjoernhommel.bsky.social
regcheck.app
Reposted by Jamie Cummins
Delighted to support MU Psych Soc's invited lecture on Forensic Metascience by departmental alum, Dr Jamie Cummins @jamiecummins.bsky.social whose work in this area seeks to enhance rigour & accuracy in scientific reporting.
Sincere thanks to Dr Cummins. #MUPsychologyAt25
November 7, 2025 at 12:36 PM
Reposted by Jamie Cummins
LLMs are now widely used in social science as stand-ins for humans—assuming they can produce realistic, human-like text
But... can they? We don’t actually know.
In our new study, we develop a Computational Turing Test.
And our findings are striking:
LLMs may be far less human-like than we think.🧵
Computational Turing Test Reveals Systematic Differences Between Human and AI Language
Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assumption rem...
arxiv.org
November 7, 2025 at 11:13 AM
It was such an honour and privilege to be back at my alma mater 9 years (!!!) after finishing my undergraduate degree to give a talk as part of the psych department's 25-year anniversary!
Lovely to welcome back Dr @jamiecummins.bsky.social for tonight's @mupsychology.bsky.social talk as part of our #MUpsychologyAt25 events @maynoothuniversity.ie
November 7, 2025 at 10:58 AM
Reposted by Jamie Cummins
Lovely to welcome back Dr @jamiecummins.bsky.social for tonight's @mupsychology.bsky.social talk as part of our #MUpsychologyAt25 events @maynoothuniversity.ie
November 6, 2025 at 6:48 PM
My master's thesis file name on my old university's thesis archive site still makes me chuckle.
October 30, 2025 at 12:21 PM
Reposted by Jamie Cummins
This year Demis Hassabis predicted AI could cure all disease in a decade.
But other scientists like Claus Wilke & Derek Lowe say biology is far more complex, or progress will be limited by clinical trials & economics.
In a new 4hr podcast episode of *Hard Drugs*, we answer: Will AI solve medicine?
Will AI solve medicine?
spotify.link
October 29, 2025 at 2:11 PM
Reposted by Jamie Cummins
I built a DAG diagram with garden hoses for teaching.
Pictured: a collider bias diagram, inspired by a blocked pipe situation I experienced (which I credit with giving me the intuition though it also ruined my belongings in the flooded cellar).
October 28, 2025 at 5:50 PM
Reposted by Jamie Cummins
Can AI simulations of human research participants advance cognitive science? In @cp-trendscognsci.bsky.social, @lmesseri.bsky.social & I analyze this vision. We show how “AI Surrogates” entrench practices that limit the generalizability of cognitive science while aspiring to do the opposite. 1/
AI Surrogates and illusions of generalizability in cognitive science
Recent advances in artificial intelligence (AI) have generated enthusiasm for using AI simulations of human research participants to generate new know…
www.sciencedirect.com
October 21, 2025 at 8:24 PM
Reposted by Jamie Cummins
New hobby:
Remaking article abstracts as movie trailers to expose hype and fearmongering.
October 20, 2025 at 10:22 AM
Reposted by Jamie Cummins
"Silicon samples" - using LLMs to generate fake survey responses instead of recruiting humans. Sounds efficient until you realize small model tweaks completely flip your results. Shortcuts in research usually aren't.
The threat of analytic flexibility in using large language models to simulate human data: A call to attention
Social scientists are now using large language models to create "silicon samples" - synthetic datasets intended to stand in for human respondents, aimed at revolutionising human subjects research.…
arxiv.org
October 9, 2025 at 1:08 PM
Reposted by Jamie Cummins
Psychologists running empirical studies to rediscover engineering design choices is such a strange genre of papers. By all means, run studies on LLM judgments -- but what other than lexical co-occurrence and statistical priors would they be based on??
Evidence that even when LLMs produce similar results to humans, they “rely on lexical associations and statistical priors rather than contextual reasoning or normative criteria. We term this divergence epistemia: the illusion of knowledge emerging when surface plausibility replaces verification”
PNAS
Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS) - an authoritative source of high-impact, original research that broadly spans...
www.pnas.org
October 17, 2025 at 10:59 AM
Reposted by Jamie Cummins
New episode of Hard Drugs!
What if you could design a protein never seen in nature?
Scientists are using new AI tools like RFDiffusion, AlphaFold & ProteinMPNN to hallucinate novel proteins to solve problems nature hasn't.
@jacobtref.bsky.social & I talk about the art of protein design 🧑‍🎨
The art of protein design with AI
YouTube video by Works in Progress
www.youtube.com
October 15, 2025 at 3:08 PM
Reposted by Jamie Cummins
Major win for our field: finally a large, replicable effect.
Results of the replication are in!
Chocolate is more desirable than poop:
Cohen's d_rm = 6.20, 95%CI [5.63, 6.78]
N = 486, two single item 1-7 Likert scales of desirability.
w/
@jamiecummins.bsky.social
Make an effect size prediction!
@jamiecummins.bsky.social and I are replicating Balcetis & Dunning's (2010) "chocolate is more desirable than poop" (Cohen's d = 4.52)
Let us know in the replies what effect size you think we'll find. Details of the study in the thread below.
October 15, 2025 at 11:29 AM
These results also make it worth reiterating the title of @ianhussey.mmmdata.io's recent blog post: if researchers find Cohen's d = 8, no they didn't
mmmdata.io/posts/2025/0...
October 15, 2025 at 1:27 PM
Reposted by Jamie Cummins
Results of the replication are in!
Chocolate is more desirable than poop:
Cohen's d_rm = 6.20, 95%CI [5.63, 6.78]
N = 486, two single item 1-7 Likert scales of desirability.
w/
@jamiecummins.bsky.social
Make an effect size prediction!
@jamiecummins.bsky.social and I are replicating Balcetis & Dunning's (2010) "chocolate is more desirable than poop" (Cohen's d = 4.52)
Let us know in the replies what effect size you think we'll find. Details of the study in the thread below.
October 14, 2025 at 6:16 PM
Reposted by Jamie Cummins
Please help us, #MetaScience community!
It's time to decide on a forever name for papercheck (scienceverse.github.io/papercheck/). We don't want it to be confused with papercheck.ai, and we plan to check other research artifacts like repo contents, data, code, and prereg. Any suggestions?
Check Scientific Papers for Best Practices
A modular, extendable system for automatically checking scientific papers for best practices using text search, R code, and/or (optional) LLM queries.
scienceverse.github.io
October 14, 2025 at 1:06 PM
Reposted by Jamie Cummins
Make an effect size prediction!
@jamiecummins.bsky.social and I are replicating Balcetis & Dunning's (2010) "chocolate is more desirable than poop" (Cohen's d = 4.52)
Let us know in the replies what effect size you think we'll find. Details of the study in the thread below.
October 13, 2025 at 11:30 AM
Reposted by Jamie Cummins
Introductory online INSPECT-SR workshop. November 6th, 12-2pm UK-time. Free, places limited. BOOK: www.trybooking.com/uk/events/la...
Introduction to INSPECT-SR Training Workshop November
An introductory 2-hour online workshop will introduce participants to the INSPECT-SR tool for assessing trustworthiness of randomised controlled...
www.trybooking.com
October 3, 2025 at 11:08 AM
Reposted by Jamie Cummins
Issue 16 of RDM Weekly is out! 📬
It includes:
- Data is Not Available Upon Request @ianhussey.mmmdata.io
- AI Generated Participants in Social Science @jamiecummins.bsky.social @science.org
- Why’s it Hard to Teach Data Cleaning? @randyau.com
and more!
rdmweekly.substack.com/p/rdm-weekly...
RDM Weekly - Issue 016
A weekly roundup of Research Data Management resources.
rdmweekly.substack.com
October 7, 2025 at 12:56 PM
Reposted by Jamie Cummins
Interesting article/paper.
I'm much less anti-AI than a lot of people on my feed. But pretty skeptical it can simulate human behavior effectively for social scientific purposes -- at least in cases where variation among humans, rather than acting like an average human, is what's important.
AI-generated ‘participants’ can lead social science experiments astray, study finds
Data produced by “silicon samples” depends on researchers’ exact choice of models, prompts, and settings
www.science.org
October 4, 2025 at 11:24 AM
Reposted by Jamie Cummins
My article "Data is not available upon request" was published in Meta-Psychology. Very happy to see this out!
open.lnu.se/index.php/me...
LnuOpen | Meta-Psychology
open.lnu.se
October 4, 2025 at 12:54 PM
Reposted by Jamie Cummins
New episode of HARD DRUGS!
AlphaFold, ProteinMPNN & other AI tools are transforming biology and drug design.
But how do they work? What can’t they do? And can we use them to make a vaccine against Strep A for the very first time?
In this episode, Jacob and I talk about hacking proteins with AI.
Hacking proteins with AI
open.spotify.com
October 1, 2025 at 4:23 PM