Francesca Padovani
@frap98.bsky.social
2nd year PhD Student at @gronlp.bsky.social 🐮 - University of Groningen
Language Acquisition - NLP
Reposted by Francesca Padovani
Interested in developmentally plausible LMs, and the role of child-directed language data?
Come to our poster TODAY (Fr 7 Nov, 10.30-12.00) #EMNLP!
“Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models”
I’m happy to share that the preprint of my first PhD project is now online!
🎊 Paper: arxiv.org/abs/2505.23689
Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models
Seminal work by Huebner et al. (2021) showed that language models (LMs) trained on English Child-Directed Language (CDL) can reach similar syntactic abilities as LMs trained on much larger amounts of ...
arxiv.org
November 6, 2025 at 9:13 PM
Reposted by Francesca Padovani
Thrilled to be heading to Suzhou with a big team of GroNLP'ers 🐮
Interested in Interpretable, Cognitively inspired, Low-resource LMs? Don't miss our posters & talks #EMNLP2025!
With only a week left until #EMNLP2025, we are happy to announce all the work we 🐮 will present 🥳 - come and say "hi" at our posters and presentations during the Main conference and the co-located events (*SEM and workshops). See you in Suzhou ✈️
October 31, 2025 at 10:50 PM
Reposted by Francesca Padovani
...ii) a direct quality reward from a teacher model, and iii) a reward based on the log probabilities of a teacher model (and its dialogue continuations). While these rewards did not improve our models' performance, two different DPO approaches did!
October 28, 2025 at 12:55 PM
Reposted by Francesca Padovani
As part of this year's BabyLM challenge, we (researchers from @gronlp.bsky.social and @clausebielefeld.bsky.social) diverged from the established pretraining paradigm by training only on dialogue data from CHILDES.
October 28, 2025 at 12:53 PM
𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪
Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data available for 45 languages to the NLP community 🎉
arxiv.org/abs/2510.10159
October 14, 2025 at 5:01 PM
Reposted by Francesca Padovani
Computational Psycholinguistics Meeting 2025
cpl2025.sites.uu.nl
When: December 18–19, 2025
Where: Utrecht, the Netherlands
Abstract submission deadline: June 15, 2025
Organizers: Jakub Dotlačil, Lena Jäger, Bruno Nicenboim, Ece Takmaz
Computational Psycholinguistics Meeting 2025 | Universiteit Utrecht
Universiteit Utrecht
cpl2025.sites.uu.nl
March 20, 2025 at 7:25 PM
Reposted by Francesca Padovani
Jane Goodall (1934-2025) 🤍
October 1, 2025 at 8:02 PM
Reposted by Francesca Padovani
My very first book review is out now 📚
Muchas gracias to @stefanhartmann.bsky.social for inviting me, looking forward to our next project(s) 😇
It's been a while since I've written a book review - here's our review of Herbst & Hoffmann (2024), my first but definitely not last collaboration with the brilliant @bbunzeck.bsky.social doi.org/10.1017/S136... ($)
Thomas Herbst and Thomas Hoffmann, A Construction Grammar of the English language: CASA – a constructionist approach to syntactic analysis (Cognitive Linguistics in Practice 5). Amsterdam and Philadel...
doi.org
September 26, 2025 at 9:43 AM
Reposted by Francesca Padovani
𝐓𝐡𝐞 𝐈𝐍𝐂𝐑𝐄𝐂 𝐩𝐫𝐨𝐣𝐞𝐜𝐭 (𝐄𝐑𝐂 𝐂𝐨𝐆, 𝐏𝐈 𝐀𝐧𝐚 𝐆𝐮𝐞𝐫𝐛𝐞𝐫𝐨𝐟) 𝐢𝐬 𝐥𝐨𝐨𝐤𝐢𝐧𝐠 𝐟𝐨𝐫 𝐚 𝐧𝐞𝐰 𝐭𝐞𝐚𝐦 𝐦𝐞𝐦𝐛𝐞𝐫! 🎉
We have an opening for a 𝐏𝐨𝐬𝐭𝐝𝐨𝐜 𝐩𝐨𝐬𝐢𝐭𝐢𝐨𝐧 in 𝐂𝐫𝐞𝐚𝐭𝐢𝐯𝐢𝐭𝐲, 𝐓𝐫𝐚𝐧𝐬𝐥𝐚𝐭𝐢𝐨𝐧 𝐚𝐧𝐝 𝐓𝐞𝐜𝐡𝐧𝐨𝐥𝐨𝐠𝐲 for a starting duration of 12 months extendible to 30 months.
For further details -> www.rug.nl/about-ug/wor...
Vacatures bij de RUG
www.rug.nl
September 30, 2025 at 9:22 PM
Reposted by Francesca Padovani
𝗧𝗵𝗲 𝗘𝘂𝗿𝗼𝗽𝗲𝗮𝗻 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿𝘀’ 𝗡𝗶𝗴𝗵𝘁 𝘄𝗮𝘀 𝗮𝗻𝘁𝗶𝗰𝗶𝗽𝗮𝘁𝗲𝗱 𝗯𝘆 𝗮𝗻 𝗲𝗾𝘂𝗮𝗹𝗹𝘆 𝗲𝘅𝗰𝗶𝘁𝗶𝗻𝗴 𝗮𝗳𝘁𝗲𝗿𝗻𝗼𝗼𝗻! 😍
Here are some photos of our GroNLP station at the Forum, where we welcomed high school classes and potential future researchers. 🧑🔬🤖
Quizzes, puzzles, demos, card games, and of course, lots of chocolate! 🍫
September 26, 2025 at 1:46 PM
The 𝗜𝗟𝗖𝗕 𝗦𝘂𝗺𝗺𝗲𝗿 𝗦𝗰𝗵𝗼𝗼𝗹 in Marseille went beyond all my expectations! 💯
A week has already flown by since I had one of the most formative experiences of my PhD so far. 👩🎨
September 12, 2025 at 9:52 AM
Reposted by Francesca Padovani
We recharged our batteries with a wonderful weekend together in the province of Drenthe! 🌳🍃
As per tradition, we kick off the academic year with plenty of sharing, smiles, and sports 💘
We wish everyone a wonderful start of the academic year <3 may it be inspiring and exciting 🎓
September 10, 2025 at 7:15 PM
Reposted by Francesca Padovani
Rowena Garcia is presenting research at #ISB15 showing that #crosslinguistic #priming occurs across #typologically unrelated languages & without overlap of constituent order. The degree of overlap modulated the magnitude of the effect.
#Tagalog #Indonesian #English
@isbilingualism.bsky.social
June 11, 2025 at 3:18 PM
Reposted by Francesca Padovani
#ACL2025 has just finished and it was great to meet old friends and connect with new colleagues. In case you missed our papers/presentations, this was us 🐮👇
@facultyofartsug.bsky.social
August 4, 2025 at 10:43 AM
Reposted by Francesca Padovani
Fabulous blog about screentime from Gonzales, Golinkoff & Hirsh-Pasek here 1/ childandfamilyblog.com/children-and...
Children & Digital Technology | The Controversies
In this guide we discuss and make practical recommendations for parents on children's use of digital technology.
childandfamilyblog.com
July 2, 2025 at 2:58 PM
Reposted by Francesca Padovani
Children are incredible language learning machines. But how do they do it? Our latest paper, just published in TICS, synthesizes decades of evidence to propose four components that must be built into any theory of how children learn language. 1/
www.cell.com/trends/cogni... @mpi-nl.bsky.social
Constructing language: a framework for explaining acquisition
Explaining how children build a language system is a central goal of research in language acquisition, with broad implications for language evolution, adult language processing, and artificial intelli...
www.cell.com
June 27, 2025 at 5:17 AM
Excited to welcome a new member to the family of language-specific BLiMPs 🎊 this time putting Turkish in the spotlight!
Thanks to the meticulous, linguistically grounded work of the brilliant @ezgibasar.bsky.social. We hope it will help push forward research on typologically diverse languages!
Proud to introduce TurBLiMP, the 1st benchmark of minimal pairs for free-order, morphologically rich Turkish language!
Pre-print: arxiv.org/abs/2506.13487
Fruit of an almost year-long project by amazing MS student @ezgibasar.bsky.social in collab w/ @frap98.bsky.social and @jumelet.bsky.social
TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs
We introduce TurBLiMP, the first Turkish benchmark of linguistic minimal pairs, designed to evaluate the linguistic abilities of monolingual and multilingual language models (LMs). Covering 16 linguis...
arxiv.org
June 20, 2025 at 5:51 AM
“Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models”
I’m happy to share that the preprint of my first PhD project is now online!
🎊 Paper: arxiv.org/abs/2505.23689
Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models
Seminal work by Huebner et al. (2021) showed that language models (LMs) trained on English Child-Directed Language (CDL) can reach similar syntactic abilities as LMs trained on much larger amounts of ...
arxiv.org
May 30, 2025 at 7:40 AM
Reposted by Francesca Padovani
A few days back 🍦 celebrating spring, sunshine, and one of our most cherished lab members! Nothing like ice cream and an indulgent break to mark a birthday the Computational Linguistics way ☀️🎉 #GroNLP #ComputationalLinguistics #SpringVibes
May 14, 2025 at 6:35 PM
Reposted by Francesca Padovani
🚀 #ILCB #SummerSchool 2025 – Apply Now! 🌞🧠
Join us Sept 1-5, 2025 at CIRM, Marseille to explore Language, Cognition & the Brain!
🗣 Language & Cognition | 🧠 Neuroscience
🤖 AI & Machine Learning | 📊 Math & Stats
📅 Apply by May 23 → www.ilcb.fr/2025-2/
Know someone interested? Tag & share! 📢
The 8th edition of the Institute of Language, Communication, and the Brain (ILCB) Summer School 2025 | ILCB
Monday, September 1st – Friday, September 5th, 2025
www.ilcb.fr
March 18, 2025 at 11:27 AM
Reposted by Francesca Padovani
Tomorrow at lunch, I’m giving a talk: “Discovering Linguistic Abstractions”. I’ll share @abstractionerc.bsky.social research projects (PI: @mariannabolog.bsky.social ) at the Linguistic Lunch @facultyofartsug.bsky.social. Thanks @frap98.bsky.social for the invite!
April 30, 2025 at 12:02 PM
Reposted by Francesca Padovani
Modern LLMs "speak" hundreds of languages... but do they really?
Multilinguality claims are often based on downstream tasks like QA & MT, while *formal* linguistic competence remains hard to gauge in lots of languages
Meet MultiBLiMP!
(joint work w/ @jumelet.bsky.social & @weissweiler.bsky.social)
✨New paper ✨
Introducing 🌍MultiBLiMP 1.0: A Massively Multilingual Benchmark of Minimal Pairs for Subject-Verb Agreement, covering 101 languages!
We present over 125,000 minimal pairs and evaluate 17 LLMs, finding that support is still lacking for many languages.
🧵⬇️
April 8, 2025 at 12:27 PM
Reposted by Francesca Padovani
✨New paper ✨
Introducing 🌍MultiBLiMP 1.0: A Massively Multilingual Benchmark of Minimal Pairs for Subject-Verb Agreement, covering 101 languages!
We present over 125,000 minimal pairs and evaluate 17 LLMs, finding that support is still lacking for many languages.
🧵⬇️
April 7, 2025 at 2:56 PM
A small delay in posting, but it is critical not to overlook the importance of this. ✊🏻
On March 18th, the academic community in Groningen took to the streets to protest the Dutch government's crazy direction to substantially cut funding to university and higher education. ✂️
March 29, 2025 at 10:05 AM
Reposted by Francesca Padovani
We just released the first version of our German BabyLM corpus: huggingface.co/datasets/bbu...
It contains 16.5M words of (more or less) developmentally plausible, child-directed, child-produced or at least child-available speech 🤓
bbunzeck/babylm-german · Datasets at Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
March 17, 2025 at 10:43 AM