Lightnews — Scholar-powered news

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

1.2K followers 380 following 160 posts

Assistant Professor @ Princeton

Previously: EPFL 🇨🇭, UFMG 🇧🇷

Interests: Computational Social Science, Platforms, GenAI, Moderation

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

I argue that if we consider these three points, we find that labeling with LLMs is neither trick nor treat. Treated as measurement instruments, their value lies in forcing us to confront uncertainty we once ignored, not in eliminating it completely.

October 25, 2025 at 6:29 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

I cluster work in this area broadly into three waves: the “wow” wave (e.g., Gilardi’s PNAS paper), the “how do we do this right?” wave (e.g., Egami’s DSL), and the “the boat is on fire” wave (e.g., Baumann’s LM hacking).

October 25, 2025 at 6:29 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Large language models are quietly transforming how social scientists label data. In dozens of new studies, undergrad coders and Turkers have been replaced by GPT-5 or Gemini 2.5 (or whatever new model just arrived). What began as a convenience is becoming a methodological shift.

October 25, 2025 at 6:29 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

None of this is "hard": great material already exists (Brady Neal on causality, Moritz Hardt on benchmarks, etc.). What's missing is mindset: causality, regression, and experimental design must become core to how we train computer scientists, not optional extras.

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

I elaborate on what I think should be taught. It boils down to (at least) four things:
1 causality: how to pose and identify effects
2 regression: as a tool for inference, not prediction
3 benchmarks: as measurements, not trophies
4 experiments: with rigor, power, and ethics

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Success is measured by benchmarks, not by robustness or causal clarity. Yet more and more papers now make causal claims, from HCI to NLP, ML to Security and Privacy.

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Why the contrast? Because the two fields treat empiricism in opposite ways.

Econometrics was forged in the crucible of skepticism. Every paper is a defensive war against omitted variables, selection bias, etc. Yet CS (and ML) was built on demonstration, not falsification ...

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

I'd posit a similar, flipped version of the law for ML:

> When an economist reads (and understands) an empirical machine learning study done after 2022, the probability that they will think of an objection that the researcher has failed to take into account is close to one.

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Henderson’s first law of econometrics reads:

> When you read an econometric study done after 2005, the probability that the researcher has failed to take into account an objection that a non-economist will think of is close to zero.

October 5, 2025 at 4:07 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

(As in, a reasonable cost for us, we'd be happy to host it for research purposes)

September 16, 2025 at 7:23 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

We are planning to, although we need to improve the current system to make it scalable (at a reasonable cost) 😅

September 16, 2025 at 7:23 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Learned a lot in this project working with @omelmalki.bsky.social, @andresmh.com, and @mariannealq.bsky.social!

This work was inspired by a swathe of excellent work reimagining social media by @jonathanstray.bsky.social, @mbernst.bsky.social, @tiziano.bsky.social, @micahcarroll.bsky.social, and others.

September 16, 2025 at 1:24 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Bonsai is modular and platform-agnostic, opening paths for integration beyond Bluesky. The paper details the backend design, study, and implications: arxiv.org/abs/2509.10776

September 16, 2025 at 1:24 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

So what? Designing around intent gives users greater agency and alignment, but also increases curation effort. Future systems should pair transparent pipelines with lightweight interfaces to make intentional feedbuilding practical and sustainable.

September 16, 2025 at 1:24 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Participants highlighted the tradeoff between agency and convenience, suggesting that greater control often comes with higher cognitive/interactional costs. Some participants described Bonsai as “a feed that finally matched what I came here for,” highlighting the promise of intentional feedbuilding!

September 16, 2025 at 1:24 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

Participants used Bonsai to find content aligned with their goals, filter out noise, and separate engagement from intent—transforming feeds into tools for research, connection, or focus rather than distraction. At the same time, intentional curation demanded more effort than passive scrolling.

September 16, 2025 at 1:24 PM

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

We implemented Bonsai on Bluesky and conducted a two-phase, multi-week study with 15 participants. This deployment allowed us to observe how people used intentional feedbuilding in practice, and how it compared to their experiences with engagement-driven defaults.

September 16, 2025 at 1:24 PM
