Johannes B. Gruber
@jbgruber.bsky.social
Senior Researcher @gesis.org // Data Editor @polcommjournal.bsky.social
🔎 political communication (#polsky + #commsky) with text analysis and #rstats (#opendata + #openscience)
🌏 JohannesBGruber.eu
👨‍💻 research software github.com/JBGruber
Pinned
Some big personal/professional news: starting next month, I will be leading a team in the Data Services for the Social Sciences department at @gesis.org (in Cologne)!
Very happy to update the {traktok} #rstats readme. After 1.5 years, you can finally search TikTok again without access to the Research API. It's slow and a bit clunky, but it works! Thanks, @michaelgoodier.bsky.social for the crucial hint!
November 11, 2025 at 8:38 AM
Reposted by Johannes B. Gruber
🚨🎉New Publication Friday 🎉🚨
Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians
w/ @ulrikeklinger.bsky.social & @andersoloflarsson.bsky.social
Out now in Political Communication.
#polisky #commsky
doi.org/10.1080/1058...
Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians
Social media platforms now play a central role in election campaigns for parties and politicians. Yet comparatively little research has compared how these actors use these platforms during and outs...
doi.org
November 7, 2025 at 1:31 PM
Reposted by Johannes B. Gruber
"dplyr but make it bussin fr fr no cap"
hadley.github.io/genzplyr/
dplyr but make it bussin fr fr no cap
`genzplyr` is an alternative syntax for `dplyr` that replaces boring old function names with GenZ slang. Your data wrangling is about to hit different.
hadley.github.io
November 8, 2025 at 10:24 AM
Reposted by Johannes B. Gruber
Here is the first piece in a series of short articles I'm doing about the DSA and researcher access to publicly available information.
It focuses on categories of researchers under the DSA, and what data they are each authorized to use. 1/
verfassungsblog.de/dsa-platform...
Using the DSA to Study Platforms
verfassungsblog.de
October 27, 2025 at 2:32 PM
Reposted by Johannes B. Gruber
The EU’s Digital Services Act (DSA) sets important rules for research using publicly available platform data. But who benefits from its protections?
DAPHNE KELLER argues that while the DSA is an important opportunity, key questions remain unresolved:
verfassungsblog.de/dsa-platform...
October 27, 2025 at 8:04 AM
Reposted by Johannes B. Gruber
Cool paper by @eddieyang.bsky.social, confirming our LLM hacking findings (arxiv.org/abs/2509.08825):
✓ LLMs are brittle data annotators
✓ Downstream conclusions flip frequently: LLM hacking risk is real!
✓ Bias correction methods can help but have trade-offs
✓ Use human experts whenever possible
October 21, 2025 at 8:02 AM
One good thing about developing software is that you can keep your own needs in mind. Like when you can never remember your username and use it as the example value 😅 #rstats
October 16, 2025 at 2:58 PM
Academic life hack: check which papers AI hallucinated most often and write them 🚀🚀🚀
And here we go. I never wrote this article, and yet it is cited here.
www.liberalbriefs.com/geopolitics/...
And of course, it sounds so plausible, I seriously checked whether I had forgotten it, or the footnote was slightly wrong.
#AIisnotresearch
October 7, 2025 at 7:12 PM
Reposted by Johannes B. Gruber
Social media data between research and infrastructures: sustainable archiving, indexing, and provision. The Social Media Access Days will take place at the @dnb-aktuelles.bsky.social from 17 to 19 March 2026. We welcome submissions until 31 October 2025. www.dnb.de/DE/Professio...
Call for Submissions: Social Media Access Days
www.dnb.de
October 1, 2025 at 6:29 AM
Reposted by Johannes B. Gruber
#AmCAT is proudly developed by the @societal-analytics.nl
You can learn more about it in the:
* Book: amcat.nl/book/
* Blog post: societal-analytics.nl/blogs/202501...
Day 2 of the #MEDemConference at @gesis.org starts with powerful tool demos:
🔍 AmCAT @sof14g1l.bsky.social on enabling large-scale text analysis of media & political debates.
🌐 HarDIS @sziaja.bsky.social on harmonizing and sustaining cross-national democracy data (surveys, parties, experts).
September 30, 2025 at 9:52 AM
@sebstier.bsky.social at #MEDem Conf: computational research of democracy stands on the shoulders of the few enthusiasts who create datasets, software and infrastructure for it. How can we move forward? Short answer: more collaboration & sharing!
September 30, 2025 at 12:28 PM
@simonsaysnothin.bsky.social at #MEDem Conf: we need to integrate our efforts instead of researchers all building their own datasets and infrastructure. Couldn't agree more!
September 29, 2025 at 11:49 AM
Reposted by Johannes B. Gruber
The "validate, validate, validate" (GRIMMER, 2014) principle of Text Analysis/NLP never gets old.
September 28, 2025 at 1:02 AM
Bluesky is not just a clone of the old Twitter. It's meant to look and feel like it to popularise a version of social media with a fundamental difference to the big platforms: its infrastructure is open.
Nice write-up of that background: overreacted.io/open-social/
Open Social — overreacted
The protocol is the API.
overreacted.io
September 27, 2025 at 7:48 AM
Reposted by Johannes B. Gruber
Wanna know more about #data #access and the Digital Services Act? Here’s our latest policy paper about how it works👇
www.weizenbaum-library.de/items/86842c...
#commsky #polisky #dsa @weizenbauminstitut.bsky.social
September 26, 2025 at 5:53 AM
Reposted by Johannes B. Gruber
❗️Our next workshop will be on October 2nd, 6 pm CEST, on Effective and Useful Feature engineering by @emilhvitfeldt.bsky.social
Register or sponsor a student by donating to support Ukraine!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #RStats
September 26, 2025 at 8:32 AM
Reposted by Johannes B. Gruber
Coming up on Monday the @medem.bsky.social conference at @gesis.org in Cologne. Stay tuned for the future of democracy research infrastructures www.medem.eu/coming-up-th... Keynotes from @simonsaysnothin.bsky.social and @sldelange.bsky.social
Coming Up: The 2025 MEDem Conference & Workshop! - Monitoring Electoral Democracy
Coming Up: the 2025 medem Conference! We are thrilled for the upcoming 3rd MEDem Conference, scheduled to take place from September 29-30 at GESIS in Cologne! The 3rd MEDem conference will bring togeth...
www.medem.eu
September 23, 2025 at 8:57 AM
"acknowledging LLM contributions is key to maintaining transparency and ethical standards in academic publishing"
Why though? Acknowledging the use of LLMs only dilutes responsibility. Authors are responsible for everything in an article. And if it's fake/plagiarised, authors are responsible.
I smell some social desirability bias. Also, who acknowledges their (overly wordy) spell checker?
What do researchers acknowledge ChatGPT for in their papers? - Impact of Social Sciences
A new study finds LLMs to be acknowledged for only a narrow set of academic tasks.
blogs.lse.ac.uk
September 22, 2025 at 2:28 PM
Just wanted to share this Google Scholar trick: I often have the problem that I want to find papers using certain computational methods, but specifically in my own field (for lit reviews).
You can do that by limiting the search to certain sources. My (imperfect) collection in the alt text.
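A minimal sketch of such a source-restricted search, in case it helps to see it spelled out. It assumes Google Scholar accepts a `source:"..."` operator per publication name and that several of them can be combined with `OR`; the journal names and the helper functions `scholar_query` and `scholar_url` are hypothetical placeholders, not the collection from the alt text.

```python
from urllib.parse import urlencode


def scholar_query(terms: str, sources: list[str]) -> str:
    """Combine search terms with one source: filter per publication name."""
    if not sources:
        return terms
    # Assumption: Google Scholar treats source:"..." as a publication filter.
    source_filter = " OR ".join(f'source:"{name}"' for name in sources)
    return f"{terms} {source_filter}"


def scholar_url(terms: str, sources: list[str]) -> str:
    """Return a Google Scholar search URL for the combined query."""
    query = scholar_query(terms, sources)
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


# Hypothetical example values, chosen only for illustration.
journals = ["Political Communication", "Journal of Communication"]
query = scholar_query('"topic model"', journals)
print(query)
print(scholar_url('"topic model"', journals))
```

The printed query can be pasted into the Google Scholar search box, or the URL opened directly; swapping in your own list of journals is the only change needed.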
September 22, 2025 at 8:03 AM
Reposted by Johannes B. Gruber
September 18, 2025 at 6:28 PM
Reposted by Johannes B. Gruber
Find us Sep 22.-26. at the #DGS2025 Conference, Campus Duisburg.
At the @gesis.org stand we present DP-R|EX – the Data Portal for Right-Wing & Extremism Data.
Let’s talk about sharing data for reuse, data management & hate speech!
👉info: datenportal-rechtsextremismus.de #ResearchData #ExtremismData
September 21, 2025 at 10:17 AM
Reposted by Johannes B. Gruber
Which Canadian MPs are on Bluesky and what do they post?
My new paper w/ @rohanalexander.bsky.social in @cjps-rcsp.bsky.social unpacks these questions, finding MPs use it like Twitter to discuss policy, the Ottawa bubble & constituency
Read more: doi.org/10.1017/S000...
#polsky #commsky #cdnpoli
September 4, 2025 at 2:03 PM
If you feel uneasy using LLMs for data annotation, you are right (if not, you should). It offers new chances for research that is difficult with traditional #NLP/#textasdata methods, but the risk of false conclusions is high!
Experiment + *evidence-based* mitigation strategies in this preprint 👇
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.
Paper: arxiv.org/pdf/2509.08825
September 15, 2025 at 1:05 PM
Reposted by Johannes B. Gruber
Are you a CSS researcher using LLMs for text annotation tasks? Do you integrate the results into downstream statistical analyses?
Turns out, even with SOTA models you have a 31-50% chance of coming to wrong conclusions this way!
Learn more about this and mitigation strategies in our new preprint 👇
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.
Paper: arxiv.org/pdf/2509.08825
September 12, 2025 at 11:12 AM