Cameron Jones
@camrobjones.bsky.social
Postdoc in the Language and Cognition lab at UC San Diego. I’m interested in persuasion, deception, LLMs, and social intelligence.
One of the most important aspects of the Turing test is that it's not static: it depends on people's assumptions about other humans and technology. We agree with @brianchristian.bsky.social that humans could (and should) come back better next year!
April 1, 2025 at 3:14 PM
As in previous work, people focused more on linguistic and socioemotional factors in their strategies & reasons. This might suggest people no longer see "classical" intelligence (e.g. math, knowledge, reasoning) as a good way of discriminating people from machines.
April 1, 2025 at 3:14 PM
We also tried giving a more basic prompt to the models, without detailed instructions on the persona to adopt. Models performed significantly worse in this condition (highlighting the importance of prompting), but were still indistinguishable from humans in the Prolific study.
April 1, 2025 at 3:14 PM
Across 2 studies (on undergrads and Prolific) GPT-4.5 was selected as the human significantly more often than chance (50%). LLaMa was not selected significantly more or less often than humans, suggesting ppts couldn't distinguish it from people. Baselines (ELIZA & GPT-4o) were worse than chance.
April 1, 2025 at 3:14 PM
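For readers wondering what "significantly more often than chance (50%)" involves, here is a minimal sketch of an exact two-sided binomial test in plain Python. It is not the analysis from the paper, which may well use a different method. The function name `binom_two_sided_p` and the counts (73 of 100) are hypothetical, chosen only to mirror the 73% figure from the thread; the real sample sizes are in the preprint.

```python
from math import comb


def binom_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a null rate p0.

    Sums the probability of every outcome that is no more likely
    under the null than the observed one.
    """
    probs = [
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # Small relative tolerance so ties are not lost to float rounding.
    total = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, NOT the study's data: an AI witness picked as
# "the human" in 73 of 100 games, tested against the 50% chance rate.
p_value = binom_two_sided_p(73, 100)
print(f"p = {p_value:.2g}")
```

A small p-value here would mean the witness was picked as human more often than guessing alone would explain. A rate that is not significantly different from 50%, as reported for LLaMa, is what you would expect if interrogators could not tell the two witnesses apart.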
Participants spoke to two "witnesses" at the same time: one human and one AI. Here are some example convos from the study. Can you tell which one is the human? Answers & original interrogator verdicts in the paper...

You can play the game yourself here: turingtest.live
April 1, 2025 at 3:14 PM
In previous work we found GPT-4 was judged to be human ~50% of the time in a 2-party Turing test, where ppts speak to *either* a human or a model.

The 2-party format is probably easier for models, for several reasons. Here we ran a new study with Turing's original 3-party setup.

arxiv.org/abs/2503.23674
April 1, 2025 at 3:14 PM
New preprint: we evaluated LLMs in a 3-party Turing test (participants speak to a human & AI simultaneously and decide which is which).

GPT-4.5 (when prompted to adopt a humanlike persona) was judged to be the human 73% of the time, suggesting it passes the Turing test (🧵)
April 1, 2025 at 3:14 PM
@yann-lecun.bsky.social at #StandUpForScience NYC in Washington Square Park — “I work on both natural and artificial intelligence, and I think this government could do with a little more intelligence.”
March 7, 2025 at 5:46 PM
We're relaunching turingtest.live on Thursday at 1pm GMT / 8am ET / 5am PT. The new site will use a 3 player format where you speak to a human and an AI simultaneously and decide which is which! We're also testing a variety of new prompting approaches.
December 9, 2024 at 4:56 PM