Bastian Bunzeck
@bbunzeck.bsky.social
Computational linguist trying to understand how humans and computers learn and use language 👶🧠🗣️🖥️💬

The work is mysterious and important. See https://bbunzeck.github.io

PhDing at @clausebielefeld.bsky.social
Reposted by Bastian Bunzeck
google: we have invented agi but have hidden it in such an obscure website no one will ever find it
anthropic: through our interpretability research, we discovered claude imagines himself wearing a bow tie at all times
openai: we added slot machines
December 29, 2025 at 7:49 PM
Reposted by Bastian Bunzeck
A fascinating recent development is that the ML research community -- as the earliest adopters of "AI for research" -- is at the frontlines of dealing with all the problems that come with that (i.e., reduced trust in results & reviewers, increased submission load, etc.).

Every other field is next! 😭
We need new rules for publishing AI-generated research. The teams developing automated AI scientists have customarily submitted their papers to standard refereed venues (journals and conferences) and to arXiv. Often, acceptance has been treated as the dependent variable. 1/
December 27, 2025 at 7:46 PM
Look what Santa has slipped under my virtual Christmas tree🎄🤩
New book! I have written a book, called Syntax: A cognitive approach, published by MIT Press.

This is open access; MIT Press will post a link soon, but until then, the book is available on my website:
tedlab.mit.edu/tedlab_websi...
tedlab.mit.edu
December 24, 2025 at 10:35 PM
Reposted by Bastian Bunzeck
Very happy to announce that my alma mater Bielefeld University (Germany) now offers an international linguistics master's program, 100% taught in English!

Here's a link with more information: linguistlist.org/issues/36/38...
LINGUIST List 36.3881 FYI: International Master’s Program Linguistics, Bielefeld University
The LINGUIST List, International Linguistics Community Online.
linguistlist.org
December 20, 2025 at 6:08 AM
Reposted by Bastian Bunzeck
1/ 🌍 How does mixing data from hundreds of languages affect LLM training?
In our new paper "Revisiting Multilingual Data Mixtures in Language Model Pretraining" we revisit core assumptions about multilinguality using 1.1B-3B models trained on up to 400 languages.
🧵👇
December 15, 2025 at 6:18 PM
Reposted by Bastian Bunzeck
All research is exploratory if you’re confused enough
December 19, 2025 at 8:01 AM
Reposted by Bastian Bunzeck
🏹 Job alert: Two fully funded PhD positions in Natural Language Processing at University of Leipzig

📍 Leipzig 🇩🇪
📅 Apply by Jan 15th
🔗 https://ellis.eu/research/jobs/2025-12-16-two-fully-funded-phd-positions-in-natural-language-processing
Two fully funded PhD positions in Natural Language Processing at University of Leipzig | European Laboratory for Learning and Intelligent Systems
ellis.eu
December 18, 2025 at 7:05 AM
Reposted by Bastian Bunzeck
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
December 15, 2025 at 5:19 PM
Reposted by Bastian Bunzeck
And they want to take TikTok away from kids.
December 14, 2025 at 2:25 AM
Reposted by Bastian Bunzeck
🧑‍🔬I’m recruiting PhD students in Natural Language Processing @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!

Topics include, but aren’t limited to:

🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology

Please share!

#NLProc #NLP
December 11, 2025 at 1:36 PM
Reposted by Bastian Bunzeck
🥳Life Update!

I’m thrilled to share that I’ll be starting as assistant professor for Natural Language Processing @unileipzig.bsky.social in April! I’m deeply grateful to everyone who supported me on this journey.

I will be recruiting PhD students with @scadsai.bsky.social, stay tuned for details!
December 10, 2025 at 1:10 PM
Reposted by Bastian Bunzeck
A couple years (!) in the making: we’re releasing a new corpus of embodied, collaborative problem solving dialogues. We paid 36 people to play Portal 2’s co-op mode and collected their speech + game recordings.

Paper: arxiv.org/abs/2512.03381
Website: berkeley-nlp.github.io/portal-dialo...

1/n
December 5, 2025 at 6:54 PM
Reposted by Bastian Bunzeck
New Cambridge Element, Creative Construction Grammar, by Thomas Hoffmann and Mark Turner, out now! Read Open Access at
https://cup.org/4pkd1np
#languageandlinguistics #LangSky
December 6, 2025 at 9:00 AM
Reposted by Bastian Bunzeck
Finally out & open access: Hoffmann & Turner on Creative Construction Grammar. How do we communicate complex meanings? How do we combine words into sentences?

#creativity #language #linguistics #Construction Grammar

Find out at

www.cambridge.org/core/element...
Creative Construction Grammar
Cambridge Core - Cognitive Linguistics - Creative Construction Grammar
www.cambridge.org
December 5, 2025 at 9:26 AM
Reposted by Bastian Bunzeck
Check out this exciting issue with papers from various flavors of Construction Grammar describing English constructions! www.degruyterbrill.com/journal/key/...
Special Issue: Describing English Constructions; Issue Editors: Barthe Bloom and Thomas Herbst
Volume 73, issue 3 of the journal Zeitschrift für Anglistik und Amerikanistik was published in 2025.
www.degruyterbrill.com
December 4, 2025 at 7:39 AM
Reposted by Bastian Bunzeck
Transformers v5 Release Candidate

"This is the first major release in five years where 800 commits have been pushed to main since the latest minor release. This release introduces several refactors that significantly simplify our APIs and internals, and comes with a large number of bug fixes."
December 2, 2025 at 12:06 AM
Reposted by Bastian Bunzeck
Last week I had the pleasure of hosting a fantastic friend and researcher, @mdhk.net , who came to visit us in Groningen for a couple of days from Amsterdam! 🎉
December 1, 2025 at 3:33 PM
Reposted by Bastian Bunzeck
We are hiring a doctoral candidate to work in our project that will create a diachronic corpus of Northern Mansi. It is a 65% position (E13 TV-L) for 2 years 9 months starting in spring 2026. Please spread the word! www.finnougristik.uni-muenchen.de/forschung/fo...
PhD position in a project on Northern Mansi - Lehrstuhl für Finnougristik - LMU München
www.finnougristik.uni-muenchen.de
December 1, 2025 at 9:02 AM
Reposted by Bastian Bunzeck
I’m excited to present SimpleStories at EurIPS!

Also if anyone at #EurIPS is interested in chatting about LLM data efficiency, interpretability, model inconsistency or other topics feel free to DM me.

Dataset and models: lnkd.in/e_VGWqhP
Code: lnkd.in/eEidmv74
Paper: lnkd.in/eH6jS9uY
December 1, 2025 at 3:41 AM
Reposted by Bastian Bunzeck
Postdoc position in Stuttgart, Germany (TV-L 13, 100%) for 18 months, on authority presuppositions in AI systems with Dr. Agnieszka Faleńska and me. For more information and application info, see here: https://www.ims.uni-stuttgart.de/documents/team/falensaa/aphic_postoc.pdf
www.ims.uni-stuttgart.de
November 24, 2025 at 8:02 AM
Reposted by Bastian Bunzeck
🚨NEW PUBLICATION ALERT!🚨
The 'Design Features' of Language Revisited (w/ @mperlman.bsky.social @glupyan.bsky.social Koen de Reus & @limorraviv.bsky.social)
Feature Review out now in #OpenAccess in @cp-trendscognsci.bsky.social! #language #linguistics
Paper: doi.org/10.1016/j.ti...
November 25, 2025 at 7:49 PM
Reposted by Bastian Bunzeck
We are advertising **11 new PhD positions** in the second cohort of our RTG on Curiosity (details on all 11 positions here: www.uni-goettingen.de/de/open+posi...). One of these positions is in my group looking at the role of curiosity in early word learning (www.uni-goettingen.de/en/644546.ht...)
Open Positions - Georg-August-Universität Göttingen
Webseiten der Georg-August-Universität Göttingen
www.uni-goettingen.de
November 25, 2025 at 1:32 PM
Reposted by Bastian Bunzeck
🚀 Introducing TMLR Beyond PDF!

🎬 This is a new, HTML-based submission format for TMLR, that supports interactive figures and videos, along with the usual LaTeX and images.

🎉 Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
November 25, 2025 at 4:12 PM
Reposted by Bastian Bunzeck
We began day 2 of our Large Language Models (LLM) for linguistics research workshop @UniKoeln with a fascinating keynote by Charlotte Pouw on "Interpreting models for speech generation and understanding using methods from #psycholinguistics". Charlotte shared […]

[Original post on fediscience.org]
November 25, 2025 at 8:54 AM
Reposted by Bastian Bunzeck
Tomorrow, at the science festival #geniale, we will show how AI voice modification can help explain the subtle differences between voice qualities that express personality, age, mood, gender, health, and much more! #trr318, #bielefeld #tts #xAI wissenswerkstadt.de/veranstaltun...
Sag was! | Wissenswerkstadt Bielefeld
Mit Hilfe von KI Stimmeigenschaften erklärbarer machen
wissenswerkstadt.de
November 21, 2025 at 1:15 PM