Lightnews — Scholar-powered news

Reposted by Johannes B. Gruber

Jula Luehring

@julaluehring.bsky.social

Misinformation research has a causality problem: lab experiments are limited; observational studies confounded.

We used causal inference on 9.9M tweets, quantifying effects in the wild while blocking backdoor paths.

Does misinfo get higher engagement? Are following discussions more emotional? 🧵

OSF

osf.io

November 11, 2025 at 9:59 AM

Johannes B. Gruber

@jbgruber.bsky.social

Very happy to update the {traktok} #rstats readme. After 1.5 years, you can finally search TikTok again without access to the Research API. It's slow and a bit clunky, but it works! Thanks, @michaelgoodier.bsky.social for the crucial hint!

November 11, 2025 at 8:38 AM

Reposted by Johannes B. Gruber

Mike Cowburn

@mikecowburn.bsky.social

🚨🎉New Publication Friday 🎉🚨

Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians

w/ @ulrikeklinger.bsky.social & @andersoloflarsson.bsky.social

Out now in Political Communication.

#polisky #commsky

doi.org/10.1080/1058...

Campaigning in the Age of Platforms: A Longitudinal Analysis of German Parties & Politicians

Social media platforms now play a central role in election campaigns for parties and politicians. Yet comparatively little research has compared how these actors use these platforms during and outs...

doi.org

November 7, 2025 at 1:31 PM

Reposted by Johannes B. Gruber

Johannes Breuer

@johannesbreuer.com

"dplyr but make it bussin fr fr no cap"
hadley.github.io/genzplyr/

dplyr but make it bussin fr fr no cap

`genzplyr` is an alternative syntax for `dplyr` that replaces boring old function names with GenZ slang. Your data wrangling is about to hit different.

hadley.github.io

November 8, 2025 at 10:24 AM

Reposted by Johannes B. Gruber

Daphne Keller

@daphnek.bsky.social

Here is the first piece in a series of short articles I'm doing about the DSA and researcher access to publicly available information.

It focuses on categories of researchers under the DSA, and what data they are each authorized to use. 1/

verfassungsblog.de/dsa-platform...

Using the DSA to Study Platforms

verfassungsblog.de

October 27, 2025 at 2:32 PM

Reposted by Johannes B. Gruber

Verfassungsblog

@verfassungsblog.de

The EU’s Digital Services Act (DSA) sets important rules for research using publicly available platform data. But who benefits from its protections?

DAPHNE KELLER argues that while the DSA is an important opportunity, key questions remain unresolved:

verfassungsblog.de/dsa-platform...

Quote from author Daphne Keller in our published article: "Scraping, in particular, is uniquely useful for investigating what platforms actually show their users - not just what they claim to show them."

October 27, 2025 at 8:04 AM

Reposted by Johannes B. Gruber

Joachim Baumann

@joachimbaumann.bsky.social

Cool paper by @eddieyang.bsky.social, confirming our LLM hacking findings (arxiv.org/abs/2509.08825):
✓ LLMs are brittle data annotators
✓ Downstream conclusions flip frequently: LLM hacking risk is real!
✓ Bias correction methods can help but have trade-offs
✓ Use human experts whenever possible

October 21, 2025 at 8:02 AM

Johannes B. Gruber

@jbgruber.bsky.social

One good thing about developing software is that you can keep your own needs in mind. Like when you can never remember your username and use it as the example value 😅 #rstats

October 16, 2025 at 2:58 PM

Johannes B. Gruber

@jbgruber.bsky.social

Academic life hack: check which papers AI hallucinated most often and write them 🚀🚀🚀

Ulrike Franke @rikefranke.bsky.social · Oct 7

And here we go. I never wrote this article, and yet it is cited here.

www.liberalbriefs.com/geopolitics/...

And of course, it sounds so plausible, I seriously checked whether I had forgotten it, or the footnote was slightly wrong.

#AIisnotresearch

October 7, 2025 at 7:12 PM

Reposted by Johannes B. Gruber

Katrin Weller

@kwelle.bsky.social

Social media data between research and infrastructures - sustainable archiving, indexing, and provision: The Social Media Access Days will take place at the @dnb-aktuelles.bsky.social from 17 to 19 March 2026. We look forward to submissions until 31 October 2025. www.dnb.de/DE/Professio...

Call for Submissions: Social Media Access Days

www.dnb.de

October 1, 2025 at 6:29 AM

Reposted by Johannes B. Gruber

Societal Analytics Lab

@societal-analytics.nl

#AmCAT is proudly developed by the @societal-analytics.nl

You can learn more about it in the:
* Book: amcat.nl/book/
* Blog post: societal-analytics.nl/blogs/202501...

MEDem @medem.bsky.social · Sep 30

Day 2 of the #MEDemConference at @gesis.org starts with powerful tool demos:
🔍 AmCAT @sof14g1l.bsky.social on enabling large-scale text analysis of media & political debates.
🌐 HarDIS @sziaja.bsky.social on harmonizing and sustaining cross-national democracy data (surveys, parties, experts).

Sofia Gil-Clavel stands at a podium presenting AmCAT at the 3rd MEDem Conference. Behind her, a slide shows the AmCAT team (Kasper Welbers, Wouter van Atteveldt, Johannes Gruber, Sofia Gil-Clavel) with the tagline: “Developed by researchers for researchers, society, and data savvy users.” Logos of MEDem, VU Amsterdam, and the Societal Analytics Lab are displayed at the top.

Sebastian Ziaja stands at a podium presenting HarDIS (Harmony in the Democratic Ideological Space) at the 3rd MEDem Conference. A slide behind him shows the HarDIS team (Lea Kaftan, Paul Bederke, Selçuk Timur Uluer) with the logos of MEDem, GESIS Leibniz Institute for the Social Sciences, and OSCARS (the funding initiative).

September 30, 2025 at 9:52 AM

Johannes B. Gruber

@jbgruber.bsky.social

@sebstier.bsky.social at #MEDem Conf: computational research of democracy stands on the shoulders of the few enthusiasts who create datasets, software and infrastructure for it. How can we move forward? Short answer: more collaboration & sharing!

How to move forward

Collaborate on improving data coverage
Filling gaps in political text corpora
Collecting online platform data via APIs, webscraping and the Digital Services Act
Share open-source software, R packages and infrastructure components
Improve conditions for data sharing
Critically evaluate and improve the application of AI
Requires collaboration and funding at the European level

September 30, 2025 at 12:28 PM

Johannes B. Gruber

@jbgruber.bsky.social

@simonsaysnothin.bsky.social at #MEDem Conf: we need to integrate our efforts instead of researchers all building their own datasets and infrastructure. Couldn't agree more!

Integrate

○ Researchers get lost in building fragmented data infrastructures (case in point: me).
○ Specialized collection is fine, but fragmented dissemination hinders use.
○ Lack of integration blocks comparative research.

What I'd like MEDem to build towards

○ Incentivization of shared infrastructures
○ Toolkits to standardize data collection (measurement and interoperability)

September 29, 2025 at 11:49 AM

Reposted by Johannes B. Gruber

Alisson Soares

@alissonmasoares.bsky.social

The "validate, validate, validate" (GRIMMER, 2014) principle of Text Analysis/NLP never gets old.

September 28, 2025 at 1:02 AM

Johannes B. Gruber

@jbgruber.bsky.social

Bluesky is not just a clone of the old Twitter. It's meant to look and feel like it to popularise a version of social media with a fundamental difference to the big platforms: its infrastructure is open.

Nice write up of that background: overreacted.io/open-social/

Open Social — overreacted

The protocol is the API.

overreacted.io

September 27, 2025 at 7:48 AM

Reposted by Johannes B. Gruber

Ulrike Klinger

@ulrikeklinger.bsky.social

Wanna know more about #data #access and the Digital Services Act? Here’s our latest policy paper about how it works👇

www.weizenbaum-library.de/items/86842c...

#commsky #polisky #dsa @weizenbauminstitut.bsky.social

September 26, 2025 at 5:53 AM

Reposted by Johannes B. Gruber

Dariia Mykhailyshyna

@dariia.bsky.social

❗️Our next workshop will be on October 2nd, 6 pm CEST, on Effective and Useful Feature engineering by @emilhvitfeldt.bsky.social

Register or sponsor a student by donating to support Ukraine!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #RStats

September 26, 2025 at 8:32 AM

Reposted by Johannes B. Gruber

Alexia Katsanidou

@alexiakatsanidou.bsky.social

Coming up on Monday the @medem.bsky.social conference at @gesis.org in Cologne. Stay tuned for the future of democracy research infrastructures www.medem.eu/coming-up-th... Keynotes from @simonsaysnothin.bsky.social and @sldelange.bsky.social

Coming Up: The 2025 MEDem Conference & Workshop! - Monitoring Electoral Democracy

Coming Up: the 2025 MEDem Conference! We are thrilled for the upcoming 3rd MEDem Conference, scheduled to take place from September 29-30 at GESIS in Cologne! The 3rd MEDem conference will bring togeth...

www.medem.eu

September 23, 2025 at 8:57 AM

Johannes B. Gruber

@jbgruber.bsky.social

"acknowledging LLM contributions is key to maintaining transparency and ethical standards in academic publishing"

Why though? Acknowledging the use of LLMs only dilutes responsibility. Authors are responsible for everything in an article. And if it's fake/plagiarised, authors are responsible.

Kai Arzheimer @kai-arzheimer.com · Sep 22

I smell some social desirability bias. Also, who acknowledges their (overly wordy) spell checker?

What do researchers acknowledge ChatGPT for in their papers? - Impact of Social Sciences

A new study finds LLMs to be acknowledged for only a narrow set of academic tasks.

blogs.lse.ac.uk

September 22, 2025 at 2:28 PM

Johannes B. Gruber

@jbgruber.bsky.social

Just wanted to share this Google Scholar trick: I often have the problem that I want to find papers using certain computational methods, but specifically in my own field (for lit reviews).

You can do that by limiting the search to certain sources. My (imperfect) collection in the alt text.

"BERT" AND "multilingual" source:"Digital Journalism" OR source:"International Journal of Press/Politics" OR source:"Journal of Communication" OR source:"New Media and Society" OR source:"Communication Methods and Measures" OR source:"Communication Research" OR source:"Journal of Computer-Mediated Communication" OR source:"Big Data and Society" OR source:"Political Communication" OR source:"Social Media and Society" OR source:"Computational Communication Research"

September 22, 2025 at 8:03 AM
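
The source-restricted query in the post above gets long quickly, so it may be easier to assemble it from a list of journals. Below is a minimal Python sketch of that idea. The function name `scholar_query` is hypothetical, and the `source:"..."` syntax is taken from the post as-is rather than from any official Google Scholar documentation, so treat the result as a string to paste into the search box and check by hand.

```python
def scholar_query(terms, sources):
    """Build a search string: quoted terms joined by AND, then source filters joined by OR."""
    # Quote each search term and join with AND, mirroring the post's example.
    term_part = " AND ".join(f'"{t}"' for t in terms)
    # dict.fromkeys drops repeated journal names while keeping their order.
    unique_sources = list(dict.fromkeys(sources))
    source_part = " OR ".join(f'source:"{s}"' for s in unique_sources)
    return f"{term_part} {source_part}"


# Journal list copied from the alt text of the post.
journals = [
    "Digital Journalism",
    "International Journal of Press/Politics",
    "Journal of Communication",
    "New Media and Society",
    "Communication Methods and Measures",
    "Communication Research",
    "Journal of Computer-Mediated Communication",
    "Big Data and Society",
    "Political Communication",
    "Social Media and Society",
    "Computational Communication Research",
]

query = scholar_query(["BERT", "multilingual"], journals)
print(query)
```

Keeping the journals in a list makes it easy to swap the search terms for each literature review while reusing the same set of sources.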

Reposted by Johannes B. Gruber

Jan Broder Engler

@jbengler.de

The new ggplot2 4.0.0 now supports absolute plot dimensions 🤩

#rstats #dataviz #phd

library(tidyverse)

mtcars |>
  head(8) |>
  rownames_to_column("name") |>
  ggplot(aes(x = drat, y = name, fill = name)) +
  geom_col() +
  theme(panel.widths = unit(50, "mm"), panel.heights = unit(50, "mm"))

September 18, 2025 at 6:28 PM

Reposted by Johannes B. Gruber

dprex.bsky.social

@dprex.bsky.social

Find us Sep 22.-26. at the #DGS2025 Conference, Campus Duisburg.
At the @gesis.org stand we present DP-R|EX – the Data Portal for Right-Wing & Extremism Data.
Let’s talk about sharing data for reuse, data management & hate speech!
👉info: datenportal-rechtsextremismus.de #ResearchData #ExtremismData

September 21, 2025 at 10:17 AM

Reposted by Johannes B. Gruber

Inessa De Angelis

@inessadeangelis.bsky.social

Which Canadian MPs are on Bluesky and what do they post?

My new paper w/ @rohanalexander.bsky.social in @cjps-rcsp.bsky.social unpacks these questions, finding MPs
use it like Twitter to discuss policy, the Ottawa bubble & constituency

Read more: doi.org/10.1017/S000...

#polsky #commsky #cdnpoli

What are Canadian Members of Parliament Doing on Bluesky? research note abstract

September 4, 2025 at 2:03 PM

Johannes B. Gruber

@jbgruber.bsky.social

If you feel uneasy using LLMs for data annotation, you are right (if not, you should). It offers new chances for research that is difficult with traditional #NLP/#textasdata methods, but the risk of false conclusions is high!

Experiment + *evidence-based* mitigation strategies in this preprint 👇

Joachim Baumann @joachimbaumann.bsky.social · Sep 12

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825

We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation". We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks. For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations. Then, we collect 13 million LLM annotations across plausible LLM configurations. These annotations feed into 1.4 million regressions testing the hypotheses. For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions. Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors. Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models. Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.

September 15, 2025 at 1:05 PM
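
The checkmark/cross comparison described in the alt text above can be sketched in a few lines of Python. This is a simplified reading of that description, not the preprint's actual metric: it only checks whether each LLM configuration reaches the same significance decision as the ground truth at the 0.05 threshold. The function name `flipped_conclusions`, the configuration names, and all p-values are made up for illustration.

```python
ALPHA = 0.05  # significance threshold mentioned in the alt text


def flipped_conclusions(p_truth, p_by_config, alpha=ALPHA):
    """Return the configurations whose significance decision differs from ground truth,
    plus their share among all configurations."""
    if not p_by_config:
        raise ValueError("need at least one configuration")
    truth_significant = p_truth < alpha
    # A configuration is "flipped" when its decision disagrees with the ground truth.
    flipped = [
        name
        for name, p in p_by_config.items()
        if (p < alpha) != truth_significant
    ]
    return flipped, len(flipped) / len(p_by_config)


# Hypothetical example: a hypothesis with no true effect (ground truth p > 0.05).
p_truth = 0.40
p_by_config = {
    "config_a": 0.62,  # not significant, matches ground truth
    "config_b": 0.03,  # significant, conclusion flipped
    "config_c": 0.21,  # not significant, matches ground truth
    "config_d": 0.01,  # significant, conclusion flipped
}

flipped, rate = flipped_conclusions(p_truth, p_by_config)
print(flipped, rate)  # ['config_b', 'config_d'] 0.5
```

The point of the sketch is only that the same hypothesis can pass or fail depending on which annotation setup is used, which is why comparing against human-annotated ground truth matters.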
