Manoel Horta Ribeiro
@manoelhortaribeiro.bsky.social
Assistant Professor @ Princeton

Previously: EPFL 🇨🇭, UFMG 🇧🇷

Interests: Computational Social Science, Platforms, GenAI, Moderation
I argue that if we consider these three points, we find that labeling with LLMs is neither trick nor treat. Treated as measurement instruments, their value lies in forcing us to confront uncertainty we once ignored, not in eliminating it completely.
October 25, 2025 at 6:29 PM
I cluster work in this area broadly into three waves: the “wow” phase (e.g., Gilardi’s PNAS paper), the “how do we do this right?” phase (e.g., Egami’s DSL), and the “the boat is on fire” wave (e.g., Baumann’s LM hacking).
October 25, 2025 at 6:29 PM
Large language models are quietly transforming how social scientists label data. In dozens of new studies, undergrad coders and Turkers have been replaced by GPT-5 or Gemini 2.5 (or whatever new model just arrived). What began as a convenience is becoming a methodological shift.
October 25, 2025 at 6:29 PM
None of this is 'hard': great material already exists (Brady Neal on causality, Moritz Hardt on benchmarks, etc.). What's missing is mindset: causality, regression, and experimental design must become core to how we train computer scientists, not optional extras.
October 5, 2025 at 4:07 PM
I elaborate on what I think should be taught. It boils down to (at least) four things:
1 causality: how to pose and identify effects
2 regression: as a tool for inference, not prediction
3 benchmarks: as measurements, not trophies
4 experiments: with rigor, power, and ethics
October 5, 2025 at 4:07 PM
Success is measured by benchmarks, not by robustness or causal clarity. Yet more and more papers now make causal claims, from HCI to NLP, ML to Security and Privacy.
October 5, 2025 at 4:07 PM
Why the contrast? Because the two fields treat empiricism in opposite ways.

Econometrics was forged in the crucible of skepticism. Every paper is a defensive war against omitted variables, selection bias, etc. Yet, CS (and ML) was built on demonstration, not falsification ...
October 5, 2025 at 4:07 PM
I'd posit a similar, flipped version of the law for ML:

> When an economist reads (and understands) an empirical machine learning study done after 2022, the probability that they will think of an objection that the researcher has failed to take into account is close to one.
October 5, 2025 at 4:07 PM
Henderson’s first law of econometrics reads:

> When you read an econometric study done after 2005, the probability that the researcher has failed to take into account an objection that a non-economist will think of is close to zero.
October 5, 2025 at 4:07 PM
(As in, a reasonable cost for us; we'd be happy to host it for research purposes)
September 16, 2025 at 7:23 PM
We are planning to, although we need to improve the current system to make it scalable (at a reasonable cost) 😅
September 16, 2025 at 7:23 PM
Learned a lot in this project working with @omelmalki.bsky.social, @andresmh.com, and @mariannealq.bsky.social!

This work was inspired by a swathe of excellent work reimagining social media by @jonathanstray.bsky.social, @mbernst.bsky.social, @tiziano.bsky.social, @micahcarroll.bsky.social, and others
September 16, 2025 at 1:24 PM
Bonsai is modular and platform-agnostic, opening paths for integration beyond Bluesky. The paper details the backend design, study, and implications: arxiv.org/abs/2509.10776
September 16, 2025 at 1:24 PM
So what? Designing around intent gives users greater agency and alignment, but also increases curation effort. Future systems should pair transparent pipelines with lightweight interfaces to make intentional feedbuilding practical and sustainable.
September 16, 2025 at 1:24 PM
Participants highlighted the tradeoff between agency and convenience, suggesting that greater control often comes with higher cognitive/interactional costs. Some participants described Bonsai as “a feed that finally matched what I came here for,” underscoring the promise of intentional feedbuilding!
September 16, 2025 at 1:24 PM
Participants used Bonsai to find content aligned with their goals, filter out noise, and separate engagement from intent—transforming feeds into tools for research, connection, or focus rather than distraction. At the same time, intentional curation demanded more effort than passive scrolling.
September 16, 2025 at 1:24 PM
We implemented Bonsai on Bluesky and conducted a two-phase, multi-week study with 15 participants. This deployment allowed us to observe how people used intentional feedbuilding in practice, and how it compared to their experiences with engagement-driven defaults.
September 16, 2025 at 1:24 PM