Debora Nozza
@deboranozza.bsky.social
Assistant Professor at Bocconi University in MilaNLP group • Working in #NLP, #HateSpeech and #Ethics • She/her • #ERCStG PERSONAE
Reposted by Debora Nozza
#MemoryMonday #NLProc 'Measuring Harmful Representations in Scandinavian Language Models' uncovers gender bias, challenging Scandinavia's equity image.
Measuring Harmful Representations in Scandinavian Language Models
Samia Touileb, Debora Nozza. Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS). 2022.
aclanthology.org
November 10, 2025 at 4:03 PM
Reposted by Debora Nozza
Feeling a little sad not to be in Suzhou for #EMNLP2025, but so proud of all the amazing work from our MilaNLP Lab! 💫
Honored to have received the Outstanding Senior Area Chair Award!
Check out our papers 👇
Proud to present our #EMNLP2025 papers!
Catch our team across Main, Findings, Workshops & Demos 👇
November 5, 2025 at 6:07 PM
Reposted by Debora Nozza
LLMs require social knowledge to understand implicit misogyny, yet they mostly fail. If you want to know more, come check my poster from 12.30 to 13.30!
Paper: aclanthology.org/2025.finding...
#EMNLP2025
Proud to present our #EMNLP2025 papers!
Catch our team across Main, Findings, Workshops & Demos 👇
November 5, 2025 at 5:24 PM
Reposted by Debora Nozza
#TBT #NLProc "Explaining Speech Classification Models" by Pastor et al. (2024) makes speech classification more transparent! 🔍 Their research reveals which words matter most and how tone and background noise impact decisions.
Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features
Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long...
aclanthology.org
November 6, 2025 at 4:04 PM
Reposted by Debora Nozza
#MemoryMonday #NLProc 'Universal Joy: A Data Set and Results for Classifying Emotions Across Languages' by Lamprinidis et al. (2021) presents a multilingual dataset and cross-lingual results for emotion classification.
Universal Joy: A Data Set and Results for Classifying Emotions Across Languages
Sotiris Lamprinidis, Federico Bianchi, Daniel Hardt, Dirk Hovy. Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. 2021.
aclanthology.org
November 3, 2025 at 4:02 PM
Reposted by Debora Nozza
Next week, I'll be at #EMNLP presenting our work "Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization" 🎉
📍 Ethics, Bias, and Fairness (Poster Session 2)
📅 Wed, November 5, 11:00-12:30 - Hall C
📖 Check the paper! arxiv.org/abs/2505.16467
See you in Suzhou! 👋
October 31, 2025 at 7:56 PM
🚨 New main paper out at #EMNLP2025! 🚨
⚡ We show that personalization of content moderation models can be harmful and perpetuate hate speech, defeating the purpose of the system and hurting the community.
We argue that personalized moderation needs boundaries, and we show how to build them.
October 31, 2025 at 5:05 PM
Reposted by Debora Nozza
Proud to present our #EMNLP2025 papers!
Catch our team across Main, Findings, Workshops & Demos 👇
October 31, 2025 at 2:04 PM
Reposted by Debora Nozza
🗓️ Nov 5 – Main Conference Posters
Personalization up to a Point
🧠 In the context of content moderation, we show that fully personalized models can perpetuate hate speech, and propose a policy-based method to impose legal boundaries.
📍 Hall C | 11:00–12:30
October 31, 2025 at 2:05 PM
Reposted by Debora Nozza
🗓️ Nov 5 – Main Conference Posters
📘 Biased Tales
A dataset of 5k short LLM bedtime stories generated across sociocultural axes with an evaluation taxonomy for character-centric attributes and context-centric attributes.
📍 Hall C | 11:00–12:30
October 31, 2025 at 2:05 PM
Reposted by Debora Nozza
🗓️ Nov 5 - Demo
Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification
🧩 Co-DETECT – an iterative, human-LLM collaboration framework for surfacing edge cases and refining annotation codebooks in text classification.
📍 Demo Session 2 – Hall C3 | 14:30–16:00
October 31, 2025 at 2:06 PM
Reposted by Debora Nozza
🗓️ Nov 6 – Findings Posters
The “r” in “woman” stands for rights.
💬 We propose a taxonomy of social dynamics in implicit misogyny (EN,IT), auditing 9 LLMs — and they consistently fail. The more social knowledge a message requires, the worse they perform.
📍 Hall C | 12:30–13:30
October 31, 2025 at 2:06 PM
Reposted by Debora Nozza
🗓️ Nov 7 – Main Conference Posters
Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance
🧍 Discussing different applications for LLM persona prompting, and how to measure their success.
📍 Hall C | 10:30–12:00
October 31, 2025 at 2:06 PM
Reposted by Debora Nozza
🗓️ Nov 7 – Main Conference Posters
TrojanStego: Your Language Model Can Secretly Be a Steganographic Privacy-Leaking Agent
🔒 LLMs can be fine-tuned to leak secrets via token-based steganography!
📍 Hall C | 10:30–12:00
October 31, 2025 at 2:06 PM
Reposted by Debora Nozza
🗓️ Nov 8 – WiNLP Workshop
No for Some, Yes for Others
🤖 We investigate how sociodemographic persona prompts affect false refusal behaviors in LLMs. Model and task type are the dominant factors driving these refusals.
October 31, 2025 at 2:06 PM
Reposted by Debora Nozza
🗓️ Nov 8 – NLPerspectives Workshop
Balancing Quality and Variation
🧮 For datasets to represent diverse opinions, they must preserve variation while filtering out spam. We evaluate annotator filtering heuristics and show how they often remove genuine variation.
October 31, 2025 at 2:07 PM
Reposted by Debora Nozza
🗓️ Nov 8 – BabyLM Workshop
Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction
👶 ContingentChat, a Teacher–Student framework that benchmarks and improves multi-turn contingency in a BabyLM trained on 100M words.
October 31, 2025 at 2:07 PM
Reposted by Debora Nozza
🗓️ Nov 8 – STARSEM Workshop
Generalizability of Media Frames: Corpus Creation and Analysis Across Countries
📰 We investigate how well media frames generalize across different media landscapes. The 15 MFC frames remain broadly applicable, with minor revisions of the guidelines.
October 31, 2025 at 2:07 PM
Reposted by Debora Nozza
🗓️ Nov 6 – Oral Presentation (TACL)
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance
⚖️ A foundation for measuring LLM political bias in realistic user conversations.
📍 A303 | 10:30–12:00
October 31, 2025 at 2:07 PM
Reposted by Debora Nozza
📊 The #DSA data access portal is live, and VLOPs/VLOSEs have begun publishing their data catalogues.
Trying to collect the links again: docs.google.com/spreadsheets...
October 29, 2025 at 3:40 PM
Reposted by Debora Nozza
Great session today in our lab reading group. Thanks to Emanuele Moscato for presenting the article “Universities are embracing AI: will students get smarter or stop thinking?” from @naturemagazine.bsky.social.
Article: www.nature.com/articles/d41...
#NLProc
October 30, 2025 at 1:35 PM
Reposted by Debora Nozza
#TBT #NLProc Explore 'Wisdom of Instruction-Tuned LLM Crowds' by Plaza et al.: aggregated LLM labels outperform single models across tasks and languages, but few-shot prompting can't top zero-shot, and supervised models still rule.
Wisdom of Instruction-Tuned Language Model Crowds. Exploring Model Label Variation
Flor Miriam Plaza-del-Arco, Debora Nozza, Dirk Hovy. Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024. 2024.
aclanthology.org
October 30, 2025 at 4:05 PM
Reposted by Debora Nozza
There’s plenty of evidence for political bias in LLMs, but very few evals reflect realistic LLM use cases — which is where bias actually matters.
IssueBench, our attempt to fix this, is accepted at TACL, and I will be at #EMNLP2025 next week to talk about it!
New results 🧵
Are LLMs biased when they write about political issues?
We just released IssueBench – the largest, most realistic benchmark of its kind – to answer this question more robustly than ever before.
Long 🧵with spicy results 👇
October 29, 2025 at 4:12 PM