Marlene Lutz
@marlutz.bsky.social
PhD student @ University of Mannheim | Social NLP | she/her
📄 Paper: arxiv.org/abs/2507.16076
Come chat with me about this at #EMNLP2025!
Huge thanks to my amazing collaborators
@indiiigo.bsky.social, @wanlo.bsky.social, Elisa Rogers, and @mstrohm.bsky.social!
7/7
The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
Persona prompting is increasingly used in large language models (LLMs) to simulate views of various sociodemographic groups. However, how a persona prompt is formulated can significantly affect outcom...
arxiv.org
October 31, 2025 at 5:45 PM
🌍 When simulating Hispanic personas, several LLMs spontaneously switched to Spanish — a pattern not seen for other groups.
This is the first systematic evidence of language-switching bias in persona prompting!
6/7
📉 Bigger ≠ better.
Across all tasks and measures, larger models performed worse than smaller ones, sometimes even showing lower opinion alignment than a random baseline.
5/7
📊 Interview-style prompting also significantly improves alignment with real-world survey data!
4/7
💡 The good news: Interview-style prompting (Q&A format) and name-based priming (using culturally associated names instead of explicit labels) consistently:
✅ Reduce stereotyping
✅ Improve diversity of responses
3/7
⚖️ LLMs still fall short when simulating marginalized identities.
Simulations of nonbinary, Hispanic, and Middle Eastern personas are more stereotyped and less diverse than those of other groups.
2/7
Reposted by Marlene Lutz
Joint work w/ @marlutz.bsky.social, Elisa Rogers, @dgarcia.eu and @mstrohm.bsky.social
You can find our code and annotated dataset of papers here: github.com/Indiiigo/LLM...
We annotated many more attributes, e.g., the LLM used and the response format, so please check it out!
5/5
GitHub - Indiiigo/LLM_rep_review: Systematic Review of the Demographic Representativeness of LLMs
Systematic Review of the Demographic Representativeness of LLMs - Indiiigo/LLM_rep_review
github.com
July 21, 2025 at 10:12 AM