Wissam Antoun
wissamantoun.bsky.social
PhD at ALMAnaCH/Inria Paris,
@aubmindlab Alumni
Interested in AI, NLP, Video Games

wissamantoun.com
Reposted by Wissam Antoun
I'm proud to share that at @inriaparisnlp.bsky.social we have released Gaperon — a suite of generative language models trained on French, English and code data, the largest of which has 24 billion parameters. Both the models and the code are being published under open licences. Short thread🧵
We are proud to announce that we trained 1.5B, 8B, and 24B generative language models from scratch on 2 to 4 tera-tokens of carefully curated, high-quality data covering French, English and code. We release our models and code under open-source licences. Thread👇
November 12, 2025 at 5:26 PM
Reposted by Wissam Antoun
We are proud to announce that we trained 1.5B, 8B, and 24B generative language models from scratch on 2 to 4 tera-tokens of carefully curated, high-quality data covering French, English and code. We release our models and code under open-source licences. Thread👇
November 12, 2025 at 5:05 PM
Reposted by Wissam Antoun
Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
November 7, 2025 at 9:11 PM
ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out, we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:
April 14, 2025 at 3:41 PM
CamemBERT 2.0: A Smarter French 🇫🇷 Language Model Aged to Perfection 👌

We release a much-needed update for the previous SOTA French encoder LM.

We introduce two new models, CamemBERTa-v2 and CamemBERT-v2, based on the DeBERTaV3 and RoBERTa recipes, respectively.

So what's new?

[1/8]
November 15, 2024 at 5:07 PM