Johannes B. Gruber
@jbgruber.bsky.social
Senior Researcher @gesis.org // Data Editor @polcommjournal.bsky.social

🔎 political communication (#polsky + #commsky) with text analysis and #rstats (#opendata + #openscience)

🌏 JohannesBGruber.eu

👨‍💻 research software github.com/JBGruber
Pinned
Some big personal/professional news: starting next month, I will be leading a team in the Data Services for the Social Sciences department at @gesis.org (in Cologne)!
Reposted by Johannes B. Gruber
Misinformation research has a causality problem: lab experiments are limited; observational studies confounded.

We used causal inference on 9.9M tweets, quantifying effects in the wild while blocking backdoor paths.

Does misinfo get higher engagement? Are the discussions that follow more emotional? 🧵
OSF
osf.io
November 11, 2025 at 9:59 AM
Very happy to update the {traktok} #rstats readme. After 1.5 years, you can finally search TikTok again without access to the Research API. It's slow and a bit clunky, but it works! Thanks, @michaelgoodier.bsky.social for the crucial hint!
November 11, 2025 at 8:38 AM
Reposted by Johannes B. Gruber
🚨🎉New Publication Friday 🎉🚨

Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians

w/ @ulrikeklinger.bsky.social & @andersoloflarsson.bsky.social

Out now in Political Communication.

#polisky #commsky

doi.org/10.1080/1058...
Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians
Social media platforms now play a central role in election campaigns for parties and politicians. Yet comparatively little research has compared how these actors use these platforms during and outs...
doi.org
November 7, 2025 at 1:31 PM
Reposted by Johannes B. Gruber
Here is the first piece in a series of short articles I'm doing about the DSA and researcher access to publicly available information.

It focuses on categories of researchers under the DSA, and what data they are each authorized to use. 1/

verfassungsblog.de/dsa-platform...
Using the DSA to Study Platforms
verfassungsblog.de
October 27, 2025 at 2:32 PM
Reposted by Johannes B. Gruber
The EU’s Digital Services Act (DSA) sets important rules for research using publicly available platform data. But who benefits from its protections?

DAPHNE KELLER argues that while the DSA is an important opportunity, key questions remain unresolved:

verfassungsblog.de/dsa-platform...
October 27, 2025 at 8:04 AM
Reposted by Johannes B. Gruber
Cool paper by @eddieyang.bsky.social, confirming our LLM hacking findings (arxiv.org/abs/2509.08825):
✓ LLMs are brittle data annotators
✓ Downstream conclusions flip frequently: LLM hacking risk is real!
✓ Bias correction methods can help but have trade-offs
✓ Use human experts whenever possible
October 21, 2025 at 8:02 AM
One good thing about developing software is that you can keep your own needs in mind. Like when you can never remember your username and use it as the example value 😅 #rstats
October 16, 2025 at 2:58 PM
Academic life hack: check which papers AI hallucinated most often and write them 🚀🚀🚀
And here we go. I never wrote this article, and yet it is cited here.

www.liberalbriefs.com/geopolitics/...

And of course, it sounds so plausible, I seriously checked whether I had forgotten it, or the footnote was slightly wrong.

#AIisnotresearch
October 7, 2025 at 7:12 PM
Reposted by Johannes B. Gruber
Social media data between research and infrastructures - sustainable archiving, indexing, and provision: the Social Media Access Days will take place at the @dnb-aktuelles.bsky.social from 17 to 19 March 2026. We welcome submissions until 31 October 2025. www.dnb.de/DE/Professio...
Call for Submissions: Social Media Access Days
www.dnb.de
October 1, 2025 at 6:29 AM
Reposted by Johannes B. Gruber
#AmCAT is proudly developed by the @societal-analytics.nl

You can learn more about it in the:
* Book: amcat.nl/book/
* Blog post: societal-analytics.nl/blogs/202501...
Day 2 of the #MEDemConference at @gesis.org starts with powerful tool demos:
🔍 AmCAT @sof14g1l.bsky.social on enabling large-scale text analysis of media & political debates.
🌐 HarDIS @sziaja.bsky.social on harmonizing and sustaining cross-national democracy data (surveys, parties, experts).
September 30, 2025 at 9:52 AM
@sebstier.bsky.social at #MEDem Conf: computational research of democracy stands on the shoulders of the few enthusiasts who create datasets, software and infrastructure for it. How can we move forward? Short answer: more collaboration & sharing!
September 30, 2025 at 12:28 PM
@simonsaysnothin.bsky.social at #MEDem Conf: we need to integrate our efforts instead of researchers all building their own datasets and infrastructure. Couldn't agree more!
September 29, 2025 at 11:49 AM
Reposted by Johannes B. Gruber
The "validate, validate, validate" (GRIMMER, 2014) principle of Text Analysis/NLP never gets old.
September 28, 2025 at 1:02 AM
Bluesky is not just a clone of the old Twitter. It's meant to look and feel like it to popularise a version of social media with a fundamental difference to the big platforms: its infrastructure is open.

Nice write up of that background: overreacted.io/open-social/
Open Social — overreacted
The protocol is the API.
overreacted.io
September 27, 2025 at 7:48 AM
Reposted by Johannes B. Gruber
Wanna know more about #data #access and the Digital Services Act? Here’s our latest policy paper about how it works👇

www.weizenbaum-library.de/items/86842c...

#commsky #polisky #dsa @weizenbauminstitut.bsky.social
September 26, 2025 at 5:53 AM
Reposted by Johannes B. Gruber
❗️Our next workshop will be on October 2nd, 6 pm CEST, on Effective and Useful Feature engineering by @emilhvitfeldt.bsky.social

Register or sponsor a student by donating to support Ukraine!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #RStats
September 26, 2025 at 8:32 AM
Reposted by Johannes B. Gruber
Coming up on Monday the @medem.bsky.social conference at @gesis.org in Cologne. Stay tuned for the future of democracy research infrastructures www.medem.eu/coming-up-th... Keynotes from @simonsaysnothin.bsky.social and @sldelange.bsky.social
Coming Up: The 2025 MEDem Conference & Workshop! - Monitoring Electoral Democracy
Coming Up: the 2025 MEDem Conference! We are thrilled for the upcoming 3rd MEDem Conference, scheduled to take place from September 29-30 at GESIS in Cologne! The 3rd MEDem conference will bring togeth...
www.medem.eu
September 23, 2025 at 8:57 AM
"acknowledging LLM contributions is key to maintaining transparency and ethical standards in academic publishing"

Why though? Acknowledging the use of LLMs only dilutes responsibility. Authors are responsible for everything in an article. And if it's fake/plagiarised, authors are responsible.
September 22, 2025 at 2:28 PM
Just wanted to share this Google Scholar trick: I often have the problem that I want to find papers using certain computational methods, but specifically in my own field (for lit reviews).

You can do that by limiting the search to certain sources. My (imperfect) collection in the alt text.
September 22, 2025 at 8:03 AM
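The Google Scholar trick above can be sketched in a few lines of Python. This is only an illustration: it assumes the restriction is done with Scholar's `source:` operator, the helper names `scholar_query` and `scholar_url` are made up for this example, and the journal names are placeholders rather than the collection from the alt text.

```python
from urllib.parse import urlencode


def scholar_query(terms, sources):
    """Combine search terms with a source filter (hypothetical helper).

    Each source becomes source:"<name>", and the filters are joined with OR
    so that a hit in any of the listed journals matches.
    """
    source_filter = " OR ".join(f'source:"{name}"' for name in sources)
    return f"{terms} ({source_filter})"


def scholar_url(terms, sources):
    """Turn the query into a Google Scholar search URL (hypothetical helper)."""
    return "https://scholar.google.com/scholar?" + urlencode(
        {"q": scholar_query(terms, sources)}
    )


# Placeholder journal list; swap in the outlets of your own field.
journals = ["Political Communication", "Journal of Communication"]
print(scholar_query('"topic model"', journals))
print(scholar_url('"topic model"', journals))
```

Pasting the printed query into the Scholar search box should have the same effect as opening the URL. How Scholar treats long OR chains or parentheses is not verified here, so it is worth checking the results against a manual search.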
Reposted by Johannes B. Gruber
The new ggplot2 4.0.0 now supports absolute plot dimensions 🤩

#rstats #dataviz #phd
September 18, 2025 at 6:28 PM
Reposted by Johannes B. Gruber
Find us Sep 22.-26. at the #DGS2025 Conference, Campus Duisburg.
At the @gesis.org stand we present DP-R|EX – the Data Portal for Right-Wing & Extremism Data.
Let’s talk about sharing data for reuse, data management & hate speech!
👉info: datenportal-rechtsextremismus.de #ResearchData #ExtremismData
September 21, 2025 at 10:17 AM
Reposted by Johannes B. Gruber
Which Canadian MPs are on Bluesky and what do they post?

My new paper w/ @rohanalexander.bsky.social in @cjps-rcsp.bsky.social unpacks these questions, finding MPs use it like Twitter to discuss policy, the Ottawa bubble & constituency

Read more: doi.org/10.1017/S000...

#polsky #commsky #cdnpoli
September 4, 2025 at 2:03 PM
If you feel uneasy using LLMs for data annotation, you are right (and if you don't, you should). LLM annotation opens up research that is difficult with traditional #NLP/#textasdata methods, but the risk of false conclusions is high!

Experiment + *evidence-based* mitigation strategies in this preprint 👇
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
September 15, 2025 at 1:05 PM