NLPnorth
@nlpnorth.bsky.social
#NLProc research group @itu.dk (Copenhagen, Denmark)

🔗 nlpnorth.github.io
📄DistaLs: A Comprehensive Collection of Language Distance Measures
👥 Rob van der Goot, Esther Ploeger, @verenablaschke.bsky.social Tanja Samardžić
🔗 aclanthology.org/2025.emnlp-d...
🎯A convenient toolkit for obtaining distance measures across languages
▶️ www.youtube.com/watch?v=SSk9...
November 5, 2025 at 1:17 PM
📄Do Syntactic Categories Help in Developmentally Motivated Curriculum Learning for Language Models?
👥 @arzuburcuguven.bsky.social @annarogers.bsky.social Rob van der Goot
🔗 aclanthology.org/2025.babylm-...
🎯We examine syntactic structures in language development and their effect on LM training
November 5, 2025 at 1:17 PM
📄Code Like Humans: A Multi-Agent Solution for Medical Coding
👥 Andreas Geert Motzfeldt, Joakim Edin, Casper Christensen @chrha.bsky.social Lars Maaløe @annarogers.bsky.social
🔗 aclanthology.org/2025.finding...
🎯Agentic LLM framework encoding official coding guidelines to map clinical notes to ICD-10
November 5, 2025 at 1:17 PM
📄The AI Gap: How Socioeconomic Status Affects Language Technology Interactions
👥 @elisabassignana.bsky.social @amandacurry.bsky.social @dirkhovy.bsky.social
🔗 arxiv.org/pdf/2505.12158
🎯We call for inclusive NLP technologies to accommodate different SES and mitigate the digital divide.
July 22, 2025 at 2:43 PM
📄 DECAF: A Dynamically Extensible Corpus Analysis Framework
👥 @mxij.me Rob van der Goot @annarogers.bsky.social
🔗 mxij.me/x/decaf
🎯 DECAF supports generalization research with clear train/test separation at scale.
July 22, 2025 at 2:43 PM
📄Identifying Open Challenges in Language Identification
👥 Rob van der Goot
🎯 We identify and quantify a variety of remaining challenges for language classification and find that cross-domain evaluation is a main bottleneck, where n-gram-based models outperform transformer-based LMs.
July 22, 2025 at 2:43 PM
📄 Research Community Perspectives on ‘Intelligence’ and LLMs
👥 @brtrm.bsky.social @terne.bsky.social @annarogers.bsky.social @heinrichst.bsky.social
🔗 arxiv.org/abs/2505.20959
🎯 What do we mean when we talk about LLM 'intelligence'? Top criteria are generalization, adaptability & reasoning.
July 22, 2025 at 2:43 PM
📄 Hypernetworks for Personalizing ASR to Atypical Speech
👥 @mxij.me *, Dianna Yee*, Karren Yang, Gautam Varma Mantena, Colin Lea
🔗 doi.org/10.1162/tacl...
🎯 Using hypernetworks, we enable ASR for atypical speech via a mix of dynamic personalization and targeted adaptation sharing.
July 22, 2025 at 2:43 PM
📄 Efficient Elicitation of Fictitious Nursing Notes from Volunteer Healthcare Professionals
👥 Jesper Bornerup @chrha.bsky.social
🎯 We introduce a data collection method and Danish dataset of fictitious nursing notes by prompting volunteers with situations
🗓 March 3rd, 14.10 (Poster session)
March 2, 2025 at 6:30 AM
📄 DAKULTUR: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers
👥 @mxijme.bsky.social @mjjzha.bsky.social @elisabassignana.bsky.social Peter Trolle, Rob van der Goot
🎯 A user study for analyzing the cultural awareness of LLMs in Danish
🗓 March 2nd, 10:30 (NB-REAL)
March 2, 2025 at 6:30 AM
📄 MORSED: Morphological Segmentation of Danish and its Effect on Language Modeling
👥 Rob, Anette, Emil, Mikkel, Nicolaj, @mjjzha.bsky.social @elisabassignana.bsky.social
🎯 We create morphological segmenters for Danish and use them for LM training
🗓 March 3rd, 14.10 (Poster session)
March 2, 2025 at 6:30 AM
📄SnakModel: Lessons Learned from Training an Open Danish Large Language Model
👥 @mjjzha.bsky.social @mxijme.bsky.social @elisabassignana.bsky.social Rob
🔗 shorturl.at/4PjoQ
🎯SnakModel is an LLM continuously pre-trained on 13.6B Danish tokens and 3.7M instruction pairs
🗓March 3rd, 10.45 (Ida-Euroopa)
March 2, 2025 at 6:30 AM