Lightnews — Scholar-powered news

Arij Riabi

@arijriabi.bsky.social

120 followers 430 following 7 posts

PhD student working on NLP for low-resource, non-standardized language varieties 🍉

Reposted by Arij Riabi

Nathan Godey

@nthngdy.bsky.social

Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

November 7, 2025 at 9:11 PM

Reposted by Arij Riabi

Petter Törnberg

@pettertornberg.com

We built the simplest possible social media platform. No algorithms. No ads. Just LLM agents posting and following.

It still became a polarization machine.

Then we tried six interventions to fix social media.

The results were… not what we expected.

arxiv.org/abs/2508.03385

Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation

Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions?...

August 6, 2025 at 8:24 AM

Reposted by Arij Riabi

Wissam Antoun

@wissamantoun.bsky.social

ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:

April 14, 2025 at 3:41 PM

Reposted by Arij Riabi

Inria Paris NLP (ALMAnaCH team)

@inriaparisnlp.bsky.social

Congratulations to @arijriabi.bsky.social who successfully defended her PhD “Small is Beautiful: Addressing Resource Scarcity, Language Variation, & Transfer Challenges for Automatic Detection of Harmful Language” last Tuesday, supervised by @zehavoc.bsky.social & @openlaurent.bsky.social 👩‍🎓🎉

PhD defence of Arij Riabi, 18 March 2025

March 25, 2025 at 10:46 AM

Arij Riabi

@arijriabi.bsky.social

I am excited to share that I have successfully defended my PhD, "Addressing Resource Scarcity, Language Variation, and Transfer Challenges for Automatic Detection of Harmful Language." 🎉
👩‍🎓👩‍🎓🎉
@inriaparisnlp.bsky.social
@sorbonne-universite.fr

March 20, 2025 at 8:45 AM

Reposted by Arij Riabi

Javier Lopetegui

@jlopetegui.bsky.social

🎉 🌍✍️ I'm thrilled to announce that our paper, "Common Ground, Diverse Roots: The Difficulty of Classifying Common Examples in Spanish Varieties", co-authored with @arijriabi.bsky.social and @zehavoc.bsky.social, has been accepted for the #VarDial2025 workshop during #COLING2025! 🎉 1/5

December 27, 2024 at 5:02 PM

Reposted by Arij Riabi

Dr Abeba Birhane

@abeba.bsky.social

most people want a quick and simple answer to why AI systems encode/exacerbate societal and historical bias/injustice and due to the reductive but common thinking of "bias in, bias out," the obvious culprit often is training data but this is not entirely true

1/

November 24, 2024 at 4:26 PM

Reposted by Arij Riabi

Alix Chagué 🌈

@alix-tz.bsky.social

Now that I am on Bluesky, let me take you again on a threaded tour of HTR-United (#HTR_United), a project founded and led by @ponteineptique.bsky.social and me since September 2021. Its main goal is to facilitate finding and sharing open datasets to train HTR and OCR models!

htr-united.github.io

HTR-United

HTR-United is a catalog and an ecosystem for sharing and finding ground truth for optical character or handwritten text recognition (OCR/HTR).

October 30, 2023 at 10:48 AM
