Anna Seo Gyeong Choi
@annaseogyeongchoi.bsky.social
phd @ cornell information science
We can decompose performance degradation by individual grammar rules.
Three rules – existential “it”, zero copula, and y’all – account for roughly half of a dialect’s accuracy drop relative to Standard American English.
November 6, 2025 at 12:10 AM
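To make "account for roughly half" concrete, here is a minimal sketch of a per-rule share calculation. It assumes each rule's accuracy drop has been measured separately and that the drops add up. The thread does not say how the paper does its decomposition, so this is illustrative only. The function name `rule_shares` and the numbers are made up, not results from the paper.

```python
def rule_shares(drop_by_rule):
    """Return each rule's fraction of the summed accuracy drop.

    drop_by_rule maps a rule name to the accuracy drop (in percentage
    points) attributed to that rule. Assumes drops are additive, which
    is a simplification and not necessarily the paper's method.
    """
    total = sum(drop_by_rule.values())
    if total == 0:
        # No measured drop at all: every rule gets a zero share.
        return {rule: 0.0 for rule in drop_by_rule}
    return {rule: drop / total for rule, drop in drop_by_rule.items()}


# Made-up illustrative numbers, NOT results from the paper.
example_drops = {"existential_it": 2.0, "zero_copula": 1.0, "other_rules": 1.0}
shares = rule_shares(example_drops)
```

With these toy inputs, `shares["existential_it"]` is 0.5, meaning that rule alone would explain half of the summed drop.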
We used the Multi-VALUE package to transform Standard American English questions from QA datasets into dialectal variants based on grammatical rules.
November 6, 2025 at 12:10 AM
We studied 6 English dialects (African American, Appalachian, Chicano, Indian, Singaporean, Southern) across 3 LLMs using 3 multiple-choice QA benchmarks.
The question: Do dialects affect performance even on easy tasks?
Answer: YES, with the worst performance on Singaporean English.
November 6, 2025 at 12:08 AM
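One way to measure a gap like this on a multiple-choice benchmark is to score the same questions in their Standard American English and dialect forms and subtract the accuracies. The sketch below assumes predictions and gold answers are lists of option letters. The helper names and the toy answers are hypothetical and not taken from the paper.

```python
def accuracy(predictions, gold):
    """Fraction of multiple-choice predictions that match the gold answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def accuracy_gap(sae_predictions, dialect_predictions, gold):
    """SAE accuracy minus dialect accuracy on the same questions.

    A positive value means the model does worse on the dialect version.
    """
    return accuracy(sae_predictions, gold) - accuracy(dialect_predictions, gold)


# Toy answers for four questions, NOT data from the paper.
gold = ["A", "B", "C", "D"]
sae_predictions = ["A", "B", "C", "A"]      # 3 of 4 correct
dialect_predictions = ["A", "B", "D", "A"]  # 2 of 4 correct
gap = accuracy_gap(sae_predictions, dialect_predictions, gold)
```

Here `gap` is 0.25, meaning the dialect version costs 25 percentage points on this toy set.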
🧵Excited to present our work at #EMNLP2025 “Analyzing Dialectal Biases in LLMs for Knowledge and Reasoning Benchmarks”!
Paper 📄 arxiv.org/abs/2510.00962
w/ Eileen Pan, Skyler Seto, @allisonkoe.bsky.social @maartjeterhoeve.bsky.social
November 6, 2025 at 12:08 AM