Lightnews — Scholar-powered news

CIS, LMU Munich

@cislmu.bsky.social

Center for Information and Language Processing (CIS): NLP research group at LMU Munich led by Hinrich Schuetze and @barbaraplank.bsky.social

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🔎 Keynote talk at WNUT @ NAACL
👥 @verenablaschke.bsky.social
📁 Workshop on Noisy and User-generated Text (May 3)
The full workshop programme is here: noisy-text.github.io/2025/
bsky.app/profile/vere...

April 29, 2025 at 3:03 PM

📝 Privacy-Preserving Federated Learning for Hate Speech Detection
🔎 We present a federated learning system with differential privacy and fine-tuned ALBERT models for low-resource hate speech detection.
👥 Ivo Júnior, @htyeh1, Axel Wisiorek, @HinrichSchuetze
📁 SRW - Long

April 29, 2025 at 3:03 PM

📝 Linguistic Features in German BERT: The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification
🔎 Analysis of linguistic features used by German BERT in a classification task.
👥 Henrike Beyer (University of Dundee), Diego Frassinelli
📁 SRW - Short

April 29, 2025 at 3:03 PM

📝 XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
🔎 A simple yet effective method to retrieve cross-lingual few-shot examples for multilingual in-context learning.
👥 @lpq29743, @andre_t_martins, @HinrichSchuetze
🔗 arxiv.org/abs/2405.05116
📁 Findings - Short

Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of...

April 29, 2025 at 3:03 PM

📝 Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum
🔎 We predict speech-to-text model performance on dialect continua with geostatistics.
👥 Ryan Soh-Eun Shim, Barbara Plank
🔗 arxiv.org/abs/2410.14589
📁 Findings - Long

There is increasing interest in looking at dialects in NLP. However, most work to date still treats dialects as discrete categories. For instance, evaluative work in variation-oriented NLP for English...

April 29, 2025 at 3:03 PM

📝 A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
🔎 An investigation of the impact of parallel corpora, ... on the performance of multilingual LLMs.
👥 @lpq29743, @andre_t_martins, @HinrichSchuetze
🔗 arxiv.org/abs/2407.00436
📁 Findings - Long

Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, an...

April 29, 2025 at 3:03 PM