Francois Meyer
banner
francois-meyer.bsky.social
Francois Meyer
@francois-meyer.bsky.social
PhD student at the University of Cape Town, working on text generation for low-resource, morphologically complex languages.
https://francois-meyer.github.io/

Cape Town, South Africa
Reposted by Francois Meyer
🌍Introducing BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data!

LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data

We extend this effort to 45 new languages!
October 15, 2025 at 10:53 AM
Reposted by Francois Meyer
𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪

Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data available for 45 languages to the NLP community 🎉

arxiv.org/abs/2510.10159
October 14, 2025 at 5:01 PM
Today our poster will be up at @loreslm.bsky.social Poster Session #2 (2-3pm local time Abu Dhabi).

It's also available online at Whova: whova.com/portal/webap...
Our paper "BabyLMs for isiXhosa: Data-Efficient Language Modelling in a Low-Resource Context" will be presented at The First Workshop on Language Models for Low-Resource Languages at #COLING2025 in Abu Dhabi.

Paper: arxiv.org/pdf/2501.03855
arxiv.org
January 20, 2025 at 6:43 AM
Our paper "BabyLMs for isiXhosa: Data-Efficient Language Modelling in a Low-Resource Context" will be presented at The First Workshop on Language Models for Low-Resource Languages at #COLING2025 in Abu Dhabi.

Paper: arxiv.org/pdf/2501.03855
arxiv.org
January 14, 2025 at 7:09 AM