Marianne de Heer Kloots
mdhk.net
@mdhk.net
Linguist in AI & CogSci 🧠👩‍💻🤖 PhD student @ ILLC, University of Amsterdam

🌐 https://mdhk.net/
🐘 https://scholar.social/@mdhk
🐦 https://twitter.com/mariannedhk
Pinned
✨ Do current neural speech models show human-like linguistic biases in speech perception?

We took inspiration from classic phonetic categorization experiments to explore where sensitivity to phonotactic context emerges in Wav2Vec2 models 🔍
(w/ @wzuidema.bsky.social)

📑 arxiv.org/abs/2407.03005

⬇️
Reposted by Marianne de Heer Kloots
New book! I have written a book, called Syntax: A cognitive approach, published by MIT Press.

This is open access; MIT Press will post a link soon, but until then, the book is available on my website:
tedlab.mit.edu/tedlab_websi...
December 24, 2025 at 7:55 PM
Reposted by Marianne de Heer Kloots
New post! Last week I shared why I thought cognitive (neuro)science hasn’t contributed as much as one might hope to the design of AI systems; this week I'm sharing my thoughts on how methods and principles from these fields *have* been useful in my work. infinitefaculty.substack.com/p/how-cognit...
How cognitive science can contribute to AI: methods for understanding
#2 in a series on cognitive science and AI
December 23, 2025 at 5:10 PM
I’m really enjoying the Computational Psycholinguistics Meeting in Utrecht! cpl2025.sites.uu.nl 🧠

Yesterday MSc student Sven Terpstra (co-supervised w/ @wzuidema.bsky.social) presented his project on predicting the N400 with GPT-derived metrics beyond surprisal openreview.net/forum?id=MAl...
Beyond surprisal: GPT-derived attention metrics offer additional...
The N400 component of the EEG signal is a well-established neural correlate of real-time language comprehension, sensitive to a range of lexical and contextual variables. While earlier studies have...
December 19, 2025 at 12:18 PM
'Tis the season to preprint BBS commentaries; I'm happy to share ours too! 🎄✨

The textual basis of current LLMs causes trouble, but linguistically relevant insights *can* be found in systems modelling the more natural form of human spoken language: the speech signal itself. arxiv.org/abs/2512.14506
December 17, 2025 at 3:22 PM
Reposted by Marianne de Heer Kloots
Why do humans have linguistic intuition? And why should you care?

A short thread about my new paper in @cadlin.bsky.social

This work has the most original insight I've ever had, a genuinely new idea about the nature of language

cadernos.abralin.org/index.php/ca...

1/20
Why Do Humans Have Linguistic Intuition? | Cadernos de Linguística
December 15, 2025 at 4:14 PM
Reposted by Marianne de Heer Kloots
Many studies of naturalistic comprehension report that surprisal (often LLM derived) explains more of the variance in data than other predictors. Why is this? And why can it be problematic for our conclusions?

A 🧵 of takeaways from our paper doi.org/10.1007/s421... with @andreaeyleen.bsky.social
What’s Surprising About Surprisal - Computational Brain & Behavior
In the computational and experimental psycholinguistic literature, the mechanisms behind syntactic structure building (e.g., combining words into phrases and sentences) are the subject of considerable...
November 17, 2025 at 5:13 PM
Reposted by Marianne de Heer Kloots
Why isn’t modern AI built around principles from cognitive science or neuroscience? Starting a substack (infinitefaculty.substack.com/p/why-isnt-m...) by writing down my thoughts on that question, as part of a first series of posts giving my current thoughts on the relation between these fields. 1/3
Why isn’t modern AI built around principles from cognitive science?
First post in a series on cognitive science and AI
December 16, 2025 at 3:40 PM
I had a wonderful time getting to know @gronlp.bsky.social last week while discussing linguistic structure and learning trajectories in speech models! ✨ Many thanks for the invite @frap98.bsky.social, already looking forward to catching up again soon :)
Last week I had the pleasure of hosting a fantastic friend and researcher, @mdhk.net , who came to visit us in Groningen for a couple of days from Amsterdam! 🎉
December 1, 2025 at 5:13 PM
Reposted by Marianne de Heer Kloots
🚨NEW PUBLICATION ALERT!🚨
The 'Design Features' of Language Revisited (w/ @mperlman.bsky.social @glupyan.bsky.social Koen de Reus & @limorraviv.bsky.social)
Feature Review out now in #OpenAccess in @cp-trendscognsci.bsky.social! #language #linguistics
Paper: doi.org/10.1016/j.ti...
November 25, 2025 at 7:49 PM
Reposted by Marianne de Heer Kloots
Interested in the evolution of human language? Check out our new paper in @science.org where we synthesize latest findings and outline a multifaceted, bio-cultural approach for studying how language evolved. Super proud of this work, and hoping it leads to exciting new research! tinyurl.com/ykacvanp
November 21, 2025 at 9:47 AM
Reposted by Marianne de Heer Kloots
The Multilingual Minds & Machines Meetings call for abstracts is now open! Everything you need to know is here -> mmmm2026.github.io
November 18, 2025 at 10:05 AM
Reposted by Marianne de Heer Kloots
happy to share our new paper, out now in Neuron! led by the incredible Yizhen Zhang, we explore how the brain segments continuous speech into word-forms and uses adaptive dynamics to code for relative time - www.sciencedirect.com/science/arti...
Human cortical dynamics of auditory word form encoding
We perceive continuous speech as a series of discrete words, despite the lack of clear acoustic boundaries. The superior temporal gyrus (STG) encodes …
November 7, 2025 at 6:16 PM
Reposted by Marianne de Heer Kloots
Delighted to share our new paper, now out in PNAS! www.pnas.org/doi/10.1073/...

"Hierarchical dynamic coding coordinates speech comprehension in the brain"

with dream team @alecmarantz.bsky.social, @davidpoeppel.bsky.social, @jeanremiking.bsky.social

Summary 👇

1/8
October 22, 2025 at 5:21 AM
Reposted by Marianne de Heer Kloots
🌍Introducing BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data!

LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data

We extend this effort to 45 new languages!
October 15, 2025 at 10:53 AM
Reposted by Marianne de Heer Kloots
Interesting paper suggesting a mechanism for why in-context learning happens in LLMs.

They show that LLMs implicitly apply an internal low-rank weight update adjusted by the context. It’s cheap (due to the low-rank) but effective for adapting the model’s behavior.

#MLSky

arxiv.org/abs/2507.16003
Learning without training: The implicit dynamics of in-context learning
One of the most striking features of Large Language Models (LLM) is their ability to learn in context. Namely at inference time an LLM is able to learn new patterns without any additional weight updat...
October 6, 2025 at 1:30 PM
Reposted by Marianne de Heer Kloots
PhD Position: Accented Speech Processing - Apply now!

Come work with Mirjam Broersma, @davidpeeters.bsky.social, and me at the Centre for Language Studies, Radboud University in the Netherlands.

Application deadline: 19 October 2025

For more information, see
www.ru.nl/en/working-a...
PhD Position: Accented Speech Processing | Radboud University
Do you want to work as a PhD: Accented Speech Processing at the Faculty of Arts? Check our vacancy!
October 2, 2025 at 2:35 PM
Huge congrats to the envisionBOX team for the Open Science award nomination! 🎉

My tutorial on speech analysis tools in Python from the Unboxing Multimodality summer school (github.com/mdhk/unboxin...) is now also available at envisionbox.org

Thanks for the invitation & this great initiative! 👏
October 2, 2025 at 5:18 PM
Reposted by Marianne de Heer Kloots
The ILCB Summer School in Marseille went beyond all my expectations! 💯

A week has already flown by since I had one of the most formative experiences of my PhD so far. 👩‍🎨
September 12, 2025 at 9:52 AM
✨ Do self-supervised speech models learn to encode language-specific linguistic features from their training data, or only more language-general acoustic correlates?

At #Interspeech2025 we presented our new Wav2Vec2-NL model and SSL-NL evaluation dataset to test this!

📄 arxiv.org/abs/2506.00981

⬇️
August 27, 2025 at 2:31 PM
Had such a great time presenting our tutorial on Interpretability Techniques for Speech Models at #Interspeech2025! 🔍

For anyone looking for an introduction to the topic, we've now uploaded all materials to the website: interpretingdl.github.io/speech-inter...
August 19, 2025 at 9:23 PM
Reposted by Marianne de Heer Kloots
Humans largely learn language through speech. In contrast, most LLMs learn from pre-tokenized text.

In our #Interspeech2025 paper, we introduce AuriStream: a simple, causal model that learns phoneme, word & semantic information from speech.

Poster P6, tomorrow (Aug 19) at 1:30 pm, Foyer 2.2!
August 19, 2025 at 1:12 AM
Reposted by Marianne de Heer Kloots
What a privilege to have #CCN2025 in (an exceptionally warm and sunny) Amsterdam this year!

It was my first time attending the conference, and being surrounded by so many talented researchers whose interests are similar to mine has been a deeply enriching experience ✨
August 17, 2025 at 1:46 PM
Huge congrats to @maithevannoort.bsky.social on her very popular poster! 🎉 She is now also on bluesky (and looking for a PhD position 👀)
MSc student Maithe van Noort will present her project (co-supervised with @mheilbron.bsky.social) on Compositional Meaning in Vision Language Models and the Brain, testing the waters with new fMRI data of the human brain on Winoground! (poster B26)
🔗 2025.ccneuro.org/poster/?id=1...
August 13, 2025 at 4:23 PM
So exciting, #CCN2025 in Amsterdam started today! We have stroopwafels!!

Catch me at my poster on Friday to chat about the role of context in neural representational alignment to spoken language systems (C34) 🙌

🔗 2025.ccneuro.org/poster/?id=K...
August 12, 2025 at 2:19 PM
I’m in hall X5 at board 3! See you there 🙌
Next week I’ll be in Vienna for my first *ACL conference! 🇦🇹✨

I will present our new BLiMP-NL dataset for evaluating language models on Dutch syntactic minimal pairs and human acceptability judgments ⬇️

🗓️ Tuesday, July 29th, 16:00-17:30, Hall X4 / X5 (Austria Center Vienna)
July 29, 2025 at 2:02 PM