CLAUSE - Computational Linguistics @ Bielefeld University
@clausebielefeld.bsky.social
CompLing group (CLAUSE) at Bielefeld U (PI: Sina Zarrieß). We work on: NLG, Language & Vision, Pragmatics & Dialogue, HateSpeech, BabyLMs, DH, and more!

clause-bielefeld.github.io
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
For years since the GPT-2 paper, emergent in-context learning (ICL) from 'next-token' training has been treated as something deeply tied to 𝐡𝐮𝐦𝐚𝐧 𝐥𝐚𝐧𝐠𝐮𝐚𝐠𝐞. But … is it?
November 18, 2025 at 5:27 PM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Am I evil? Am I likeable?

Need a 10-minute break? Like fantasy? Loathe it? Take part in our study and help us by rating images of fictional characters here:
bixprag.lili.uni-bielefeld.de/publix/0aSWK...
November 19, 2025 at 10:25 AM
For this week’s group colloquium, we invited Loulou Kosmala from Paris-Est Créteil University. She gave a talk on multimodal feedback during all types of conversation, from real life to virtual, from learners to adults, from L1 to L2, and more! 🤩
November 11, 2025 at 10:44 AM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
As part of this year's BabyLM challenge, we (researchers from @gronlp.bsky.social and @clausebielefeld.bsky.social) diverged from the established pretraining paradigm by training only on dialogue data from CHILDES.
October 28, 2025 at 12:53 PM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Preprint alert! We release BabyBabelLM, a multilingual benchmark of developmentally plausible training data. I was responsible for German and Polish data as well as various child-directed wikis. Immensely rewarding project with exceptionally cool co-authors. 🥳🚀
𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪

Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data available for 45 languages to the NLP community 🎉

arxiv.org/abs/2510.10159
October 14, 2025 at 5:19 PM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪

Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data available for 45 languages to the NLP community 🎉

arxiv.org/abs/2510.10159
October 14, 2025 at 5:01 PM
Happening in an hour! 🥳
If you are at #IWCS, then you should not miss Sanne’s talk “Not Just Who or What: Modeling the Interaction of Linguistic and Annotator Variation in Hateful Word Interpretation” (Sanne Hoeken, Özge Alacam, Dong Nguyen, Massimo Poesio, Sina Zarrieß), tomorrow at 16:30! 🕟
@sannehoeken.bsky.social
September 23, 2025 at 1:36 PM
If you are at #IWCS, then you should not miss Sanne’s talk “Not Just Who or What: Modeling the Interaction of Linguistic and Annotator Variation in Hateful Word Interpretation” (Sanne Hoeken, Özge Alacam, Dong Nguyen, Massimo Poesio, Sina Zarrieß), tomorrow at 16:30! 🕟
@sannehoeken.bsky.social
September 22, 2025 at 10:15 AM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Sina Zarrieß is giving the KONVENS keynote on training BabyLMs #nlproc
The slide shows the number of words a 12yo human has seen in their lifetime compared to the number of words typical language models have seen in training #llm
September 11, 2025 at 11:46 AM
Happening now: Sina’s keynote on our BabyLM work. 🥳
September 11, 2025 at 11:34 AM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Great first day at #KONVENS2025 today. Looking forward to another engaging day with a keynote by Sina Zarrieß tomorrow 🤓
@clausebielefeld.bsky.social
September 10, 2025 at 8:36 PM
Don’t miss Sina’s keynote on BabyLMs at #konvens tomorrow!
From conference to conference: after last week’s #semdial I am at #konvens in Hildesheim this week. I will be presenting our German BabyLM Corpus (with @simphon.bsky.social) and our PI Sina Zarrieß will give a keynote on BabyLMs tomorrow. 🥳
September 10, 2025 at 11:09 AM
Final keynote of #semdial by David Schlangen on “Meaningful Interaction with Unreal Speakers?” 😇💬
September 5, 2025 at 9:32 AM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Final day at #semdial2025 #bialogue: four more presentations, one keynote and hopefully many engaging discussions. Let's go!
September 5, 2025 at 6:11 AM
Second #semdial keynote by Robert Hawkins on “Foraging for common ground”
September 4, 2025 at 2:03 PM
Day 2 of #semdial starts with a session on LMs and dialogue systems 🤩
September 4, 2025 at 6:40 AM
And the second talk features contributions by our PI Sina Zarrieß. 🤩
Now coming up: session 1 on naturalistic dialogue 👌
September 3, 2025 at 8:35 AM
#semdial has begun 💬
First keynote by Arabella Sinclair from the University of Aberdeen on “The many reasons for repetition in Dialogue”.
September 3, 2025 at 7:33 AM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
#semdial is about to begin 🥳
September 3, 2025 at 7:01 AM
#semdial2025, the long-awaited #bialogue conference starts tomorrow! We are looking forward to three wonderful conference days, featuring three exciting keynotes, and many oral and poster presentations on the semantics and pragmatics of dialogue. 👄💬
Check out the program and proceedings below. 👇
September 2, 2025 at 8:10 PM
Reposted by CLAUSE - Computational Linguistics @ Bielefeld University
Is simpler child-directed language easier to learn?

Check out our CoNLL paper "Do Construction Distributions Shape Formal Language Learning in German BabyLMs?"

@conll-conf.bsky.social
August 1, 2025 at 9:24 AM
Our PI Sina will give an oral presentation on "Components of Creativity: Language Model-based Predictors for Clustering and Switching in Verbal Fluency" at @conll-conf.bsky.social in 45 minutes. Come check it out if you are at @aclmeeting.bsky.social #ACL2025NLP
August 1, 2025 at 9:13 AM
Impromptu dinner after @conll-conf.bsky.social #ACL2025NLP, connecting Bielefeld and the Netherlands over Greek food 😇👌
July 31, 2025 at 5:17 PM
Happening now: catch Simeon, Manar and Larissa presenting their paper “Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks?” in hall X5. #ACL2025NLP
July 28, 2025 at 4:00 PM
Happening now — Clara, Judith and Sina present their poster:
Can LLMs Ground when they (Don’t) Know: A Study on Direct and Loaded Political Questions
(Poster board 45) #ACL2025NLP
July 28, 2025 at 8:45 AM