Jaap Jumelet
@jumelet.bsky.social
Postdoc @rug.nl with Arianna Bisazza.

Interested in NLP, interpretability, syntax, language acquisition and typology.
For more information check out the website, paper, and datasets:

Website: babylm.github.io/babybabellm/
Paper: arxiv.org/pdf/2510.10159

We hope BabyBabelLM will continue as a 'living resource', fostering both more efficient NLP methods and new avenues for cross-lingual computational linguistics!
October 15, 2025 at 10:53 AM
Alongside our training resources, we also release an evaluation pipeline that assesses different aspects of language learning.

We present results for various simple baseline models, but hope this can serve as a starting point for a multilingual BabyLM challenge in future years!
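For readers curious what such an evaluation concretely involves, here is a minimal sketch of BLiMP-style minimal-pair scoring with a causal LM: the model is credited with a pair if it assigns a higher log-probability to the grammatical sentence than to its ungrammatical counterpart. The checkpoint name and example pair are placeholders, and this is a sketch rather than the actual BabyBabelLM pipeline.

```python
# Sketch: score a minimal pair with a causal LM and check that the
# grammatical sentence receives the higher log-probability.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any causal LM checkpoint would do

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of token log-probabilities of the sentence under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Predict token t from the tokens before it (shift by one position).
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp.sum().item()

# Placeholder minimal pair (subject-verb agreement).
good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."
print(sentence_logprob(good) > sentence_logprob(bad))  # ideally True
```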
October 15, 2025 at 10:53 AM
To deal with data imbalances, we divide languages into three Tiers. This better enables cross-lingual studies and makes it possible for low-resource languages to be a part of BabyBabelLM as well.
October 15, 2025 at 10:53 AM
With a fantastic team of international collaborators, we have developed a pipeline for creating LM training data from resources that children are exposed to.

We release this pipeline and welcome new contributions!

Website: babylm.github.io/babybabellm/
Paper: arxiv.org/pdf/2510.10159
October 15, 2025 at 10:53 AM
As kids (in Breda) we often played "1 keer tets", where you were allowed to let the football bounce at most once; I also had no idea that was a Brabant dialect word.
September 1, 2025 at 2:33 PM
Congrats and good luck in Canada!
July 1, 2025 at 11:05 PM
Ohh cool! Nice to see the interactions-as-structure idea I had back in 2021 is still being explored!
June 12, 2025 at 10:37 PM
Sharply written and I fully agree, but it is a bit ironic that the message sits behind a 450 euro paywall :') (thanks for the screenshots!)
April 23, 2025 at 11:40 AM
That is definitely possible, and a potential confounding factor. In RuBLiMP, a Russian benchmark, they defined a way to validate this based on LM probabilities, but we left that open for future work. The poor performance on low-resource languages shows they're definitely not trained on all of UD, though!
April 17, 2025 at 7:03 PM