Petter Törnberg
@pettertornberg.com
Assistant Professor in Computational Social Science at University of Amsterdam
Studying the intersection of AI, social media, and politics.
Polarization, misinformation, radicalization, digital platforms, social complexity.
Find my co-authors on Bluesky: @chrisbail.bsky.social @cbarrie.bsky.social
Colleagues who do excellent work in this field, and might find these results interesting:
@mbernst.bsky.social
@robbwiller.bsky.social
@joon-s-pk.bsky.social
@janalasser.bsky.social
@dgarcia.eu
@aaronshaw.bsky.social
November 7, 2025 at 11:19 AM
This work was carried out by the amazing Nicolò Pagan, together with Chris Bail, Chris Barrie, and Anikó Hannák.
Paper (preprint): arxiv.org/abs/2511.04195
Happy to share prompts, configs, and analysis scripts.
Computational Turing Test Reveals Systematic Differences Between Human and AI Language
Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assumption rem...
arxiv.org
November 7, 2025 at 11:13 AM
Takeaways for researchers:
• LLMs are worse stand-ins for humans than they may appear.
• Don’t rely on human judges.
• Measure detectability and meaning.
• Expect a style–meaning trade-off.
• Use examples + context, not personas.
• Affect is still the biggest giveaway.
November 7, 2025 at 11:13 AM
We also found some surprising trade-offs:
🎭 When models sound more human, they drift from what people actually say.
🧠 When they match meaning better, they sound less human.
Style or meaning — you have to pick one.
November 7, 2025 at 11:13 AM
So what actually helps?
Not personas. And fine-tuning? Not always.
The real improvements came from:
✅ Providing stylistic examples of the user
✅ Adding context retrieval from past posts
Together, these reduced detectability by 4-16 percentage points.
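For readers who want the mechanics: here is a rough sketch of what those two steps can look like in code. The function name, prompt wording, and example counts are illustrative assumptions, not our exact configuration.

```python
# Hypothetical sketch of the two calibration steps above: few-shot stylistic
# examples from the target user plus retrieved past posts as context.
# Prompt wording and counts are placeholders, not the paper's exact setup.
from typing import List

def build_prompt(user_examples: List[str],
                 retrieved_posts: List[str],
                 thread_to_reply_to: str) -> str:
    """Assemble a generation prompt from stylistic examples and retrieved context."""
    example_block = "\n".join(f"- {post}" for post in user_examples[:5])
    context_block = "\n".join(f"- {post}" for post in retrieved_posts[:3])
    return (
        "You are replying on social media as a specific user.\n"
        f"Examples of how this user writes:\n{example_block}\n\n"
        f"Relevant past posts by this user:\n{context_block}\n\n"
        f"Thread to reply to:\n{thread_to_reply_to}\n\n"
        "Write the user's reply:"
    )

# Toy usage with made-up posts:
print(build_prompt(
    ["honestly this is wild", "cannot believe it's november already lol"],
    ["posted about the election results last week"],
    "What do you all make of the new platform policy?"))
```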
November 7, 2025 at 11:13 AM
Some findings surprised us:
⚙️ Instruction-tuned models — the ones fine-tuned to follow prompts — are easier to detect than their base counterparts.
📏 Model size doesn’t help: even 70B models don’t sound more human.
November 7, 2025 at 11:13 AM
Where do LLMs give themselves away?
❤️ Affective tone and emotion — the clearest tell.
✍️ Stylistic markers — average word length, toxicity, hashtags, emojis.
🧠 Topic profiles — especially on Reddit, where conversations are more diverse and nuanced.
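Several of these stylistic markers can be computed straight from the post text. The snippet below is an illustrative extraction; the exact feature definitions are assumptions, not our full feature set.

```python
# Illustrative extraction of a few stylistic markers mentioned above
# (average word length, hashtag and emoji counts). Metric definitions are
# assumptions for demonstration only.
import re

def stylistic_features(post: str) -> dict:
    words = post.split()
    return {
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "hashtag_count": len(re.findall(r"#\w+", post)),
        "emoji_count": len(re.findall(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", post)),
    }

print(stylistic_features("Loving this new paper! 🎉 #NLP #LLMs"))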
November 7, 2025 at 11:13 AM
The results were clear — and surprising.
Even short social media posts written by LLMs are readily distinguishable.
Our BERT-based classifier spots AI with 70–80% accuracy across X, Bluesky, and Reddit.
LLMs are much less human-like than they may seem.
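For the curious, a minimal sketch of what such a detectability classifier can look like, assuming the Hugging Face transformers and datasets libraries. The toy data, model choice, and hyperparameters are placeholders, not our exact setup.

```python
# Minimal sketch: fine-tune a BERT-style classifier to label posts as
# human- or LLM-written, then read off held-out accuracy as "detectability".
# Toy data and hyperparameters are placeholders.
import numpy as np
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = human, 1 = LLM

# Placeholder posts; in practice these would be paired human and LLM replies.
toy = Dataset.from_dict({
    "text": ["ok but have you SEEN the polls today??",
             "I find this development deeply concerning and thought-provoking."],
    "label": [0, 1],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = toy.map(tokenize, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="detector", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=train_ds,
    eval_dataset=train_ds,  # use a proper held-out split in a real run
    compute_metrics=accuracy,
)
trainer.train()
print(trainer.evaluate())  # detectability ≈ held-out classification accuracy
```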
November 7, 2025 at 11:13 AM
We test the state-of-the-art methods for calibrating LLMs — and then push further, using advanced fine-tuning.
We benchmark 9 open-weight LLMs across 5 calibration strategies:
👤 Persona
✍️ Stylistic examples
🧩 Context retrieval
⚙️ Fine-tuning
🎯 Post-generation selection
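Roughly speaking, the benchmark amounts to a grid of model × strategy × platform runs. The sketch below is only an illustration of that design; the model names are placeholders, not the nine models we actually used.

```python
# Illustrative evaluation grid for the benchmark described above.
# Model identifiers are placeholders for the nine open-weight LLMs.
from itertools import product

models = ["llama-3.1-8b", "mistral-7b", "gemma-2-9b"]  # placeholders
strategies = ["persona", "stylistic_examples", "context_retrieval",
              "fine_tuning", "post_generation_selection"]
platforms = ["twitter", "bluesky", "reddit"]

runs = [{"model": m, "strategy": s, "platform": p}
        for m, s, p in product(models, strategies, platforms)]
print(f"{len(runs)} model x strategy x platform conditions to generate and score")
```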
November 7, 2025 at 11:13 AM
We use our Computational Turing Test to see whether LLMs can produce realistic social media conversations.
We use data from X (Twitter), Bluesky, and Reddit.
This task is arguably what LLMs should do best: they are literally trained on this data!
November 7, 2025 at 11:13 AM
We introduce a Computational Turing Test — a validation framework that compares human and LLM text using:
🕵️♂️ Detectability — can an ML classifier tell AI from human?
🧠 Semantic fidelity — does it mean the same thing?
✍️ Interpretable linguistic features — style, tone, topics.
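One plausible way to operationalize the semantic-fidelity dimension is embedding similarity between the human reply and the LLM reply to the same thread. The sketch below uses sentence-transformers with an assumed model name and is illustrative, not necessarily our exact measure.

```python
# Illustrative semantic-fidelity score: cosine similarity between sentence
# embeddings of the human reply and the LLM reply to the same thread.
# Model name and scoring are assumptions for demonstration.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_fidelity(human_reply: str, llm_reply: str) -> float:
    emb = encoder.encode([human_reply, llm_reply], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

print(semantic_fidelity("Congrats on the preprint!",
                        "Great work on the new paper!"))
```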
November 7, 2025 at 11:13 AM
Most prior work validated "human-likeness" with human judges. Basically, do people think it looks human?
But humans are actually really bad at this task: we are subjective, scale poorly, and very easy to fool.
We need something more rigorous.
November 7, 2025 at 11:13 AM
The battlefield of misinformation isn’t just about facts.
It’s about form.
Design and aesthetics have become powerful weapons - shaping what feels rational, what seems credible, and who gets to speak for science.
November 4, 2025 at 8:48 PM
This aesthetic strategy expands denialism’s reach.
It appeals to audiences who’d never click on conspiracies -
because it looks like reason, not ideology.
By mimicking science, denialists perform neutrality while undermining it.
This isn’t just denial.
It’s strategic depoliticization.
November 4, 2025 at 8:48 PM
Meanwhile, climate researchers and activists are portrayed as emotional and irrational:
😢 Crying protesters
⚠️ Angry crowds
🚫 “Ideological fanatics”
The contrast is deliberate:
Climate denial looks calm and factual.
Climate action looks hysterical and extreme.
November 4, 2025 at 8:48 PM
These posts could pass for pages from a scientific report -
except they twist or cherry-pick data to cast doubt on climate science.
They give misinformation the aesthetics of rationality:
white men in white lab coats pointing at complicated graphs.
November 4, 2025 at 8:48 PM
When we examined the visual language of climate misinformation, the results were striking
We found what we call "scientific mimicry".
Much of it borrows the look and feel of science:
clean graphs, neutral tones, and technical diagrams that perform objectivity.
It looks like science - but it’s not
November 4, 2025 at 8:48 PM
On social media, content is no longer just text -
it’s text wrapped in images and motion.
Visuals travel faster, trigger emotion more easily, and slip past critical thought.
That’s what makes them such fertile ground for misinformation -
and yet, we’ve barely studied them.
November 4, 2025 at 8:48 PM
Yeah it should be noted that the ANES data only includes 18+ US citizens.
But this does track with my BSc students. They seem to be much less online than I.
October 30, 2025 at 2:16 PM
Here's the full preprint.
Feel free to write me if you want any additional analyses in the final version!
arxiv.org/abs/2510.25417
Shifts in U.S. Social Media Use, 2020-2024: Decline, Fragmentation, and Enduring Polarization
Using nationally representative data from the 2020 and 2024 American National Election Studies (ANES), this paper traces how the U.S. social media landscape has shifted across platforms, demographics,...
arxiv.org
October 30, 2025 at 8:09 AM
Posting is correlated with affective polarization:
😡 The most partisan users — those who love their party and despise the other — are more likely to post about politics
🥊 The result? A loud angry minority dominates online politics, which itself can drive polarization (see doi.org/10.1073/pnas...)
October 30, 2025 at 8:09 AM
Twitter/X is a story on its own:
🔴 While users have become more Republican
💥 POSTING has completely transformed: it has moved nearly ❗50 percentage points❗ from Democrat-dominated to slightly Republican-leaning.
October 30, 2025 at 8:09 AM
Politically, the landscape is shifting too:
🔴 Nearly all platforms have become more Republican
🔵 But they remain Democratic-leaning overall
🏃♂️ Democrats are fleeing to smaller platforms (Bluesky, Threads, Mastodon)
October 30, 2025 at 8:09 AM