NLP research group @IDSIA
idsianlp.bsky.social
Presenting at #EMNLP2025 in a moment, session on "Multilinguality and Language Diversity 2" (A301). Our paper on Tokenization Fairness: arxiv.org/abs/2509.20045
Tokenization and Representation Biases in Multilingual Models on Dialectal NLP Tasks
Dialectal data are characterized by linguistic variation that appears small to humans but has a significant impact on the performance of models. This dialect gap has been related to various factors (e...
November 6, 2025 at 9:32 AM
This is the inaugural post for the idsianlp account on Bluesky.
March 8, 2025 at 4:10 PM