Our work arxiv.org/abs/2506.00628 (Interspeech '25) finds that *accent-language confusion* is an important culprit, ties it to the length of the features the model relies on, and proposes a fix.
Saw some magnolias too :)
A decision tree trained on features describing language similarity, baseline performance, and language resourcedness tells us that baseline performance is the most important factor in explaining gains.
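A minimal sketch of that kind of analysis, as a one-split decision stump in plain Python: for each feature, find the threshold split that best reduces the variance of the gains, then report the feature with the most explanatory split. The feature names and the toy numbers are illustrative, not the paper's data.

```python
def variance(ys):
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split(rows, feature_idx, target_idx=-1):
    """Weighted variance of the target after the best threshold split
    on one feature (lower = the feature explains more of the gains)."""
    values = sorted({r[feature_idx] for r in rows})
    best = variance([r[target_idx] for r in rows])
    for t in values[:-1]:
        left = [r[target_idx] for r in rows if r[feature_idx] <= t]
        right = [r[target_idx] for r in rows if r[feature_idx] > t]
        score = (len(left) * variance(left)
                 + len(right) * variance(right)) / len(rows)
        best = min(best, score)
    return best

# rows: (similarity, baseline_score, resourcedness, gain) -- toy data in
# which the low-baseline languages happen to get the largest gains
rows = [
    (0.9, 35.0, 0.8, 1.0),
    (0.8, 30.0, 0.2, 2.0),
    (0.6, 10.0, 0.2, 12.0),
    (0.9, 9.0, 0.7, 11.0),
    (0.7, 12.0, 0.3, 10.0),
    (0.9, 28.0, 0.9, 3.0),
]
names = ["similarity", "baseline", "resourcedness"]
scores = {n: best_split(rows, i) for i, n in enumerate(names)}
print(min(scores, key=scores.get))  # feature whose split explains gains best
```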
Most lexicons do, which is why we had to curate our own function word lexicons using statistical alignment on small bitext.
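A minimal sketch of lexicon curation by statistical alignment: score source/target word pairs on a tiny parallel corpus with the Dice coefficient and keep the best-scoring translation for each function word. The toy "standard/dialect" bitext and the function-word list are illustrative, not the paper's data or alignment toolkit.

```python
from collections import Counter
from itertools import product

# toy bitext: standard-language sentences paired with dialectal ones
bitext = [
    ("el perro come", "lo can come"),
    ("el gato duerme", "lo gato dorme"),
    ("el cielo es azul", "lo celo es azul"),
    ("la casa es grande", "a casa es grande"),
    ("la luna es blanca", "a luna es blanca"),
]

src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
for src, tgt in bitext:
    s_words, t_words = set(src.split()), set(tgt.split())
    src_counts.update(s_words)
    tgt_counts.update(t_words)
    pair_counts.update(product(s_words, t_words))  # sentence co-occurrence

def dice(s, t):
    """Dice coefficient: high when s and t mostly occur together."""
    return 2 * pair_counts[(s, t)] / (src_counts[s] + tgt_counts[t])

function_words = ["el", "la"]
lexicon = {f: max(tgt_counts, key=lambda t: dice(f, t))
           for f in function_words}
print(lexicon)  # -> {'el': 'lo', 'la': 'a'}
```

On real data one would use a proper aligner and frequency thresholds, but the co-occurrence idea is the same.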
DialUp helps a lot for some families and languages! 9 languages, mostly from the Indic and Romance families, show gains of 10+ BLEU points with M2M.
We separate content words from function words, because the two vary differently across dialects.
We finetune on this data, and evaluate on actual dialects. That’s M→D.
(In this paper: aclanthology.org/2024.emnlp-m...)
Briefly: we add linguistically motivated noise on top of HRL text, in order to mimic dialectal variation.
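A minimal sketch of the idea, assuming two hypothetical noise models: function words get a lexical swap from a small dialectal lexicon, while content words get character-level, phonology-like perturbations. The lexicon and the character rules below are toy illustrations, not the noise functions from the paper.

```python
import random

FUNCTION_LEXICON = {"el": "lo", "la": "a", "es": "ye"}  # toy dialect lexicon
CHAR_RULES = {"ll": "y", "o": "u", "v": "b"}            # toy sound changes

def noise_word(word, rng, p=0.5):
    """Apply each character rule with probability p (content words)."""
    for old, new in CHAR_RULES.items():
        if old in word and rng.random() < p:
            word = word.replace(old, new)
    return word

def dialectify(sentence, seed=0):
    """Turn HRL text into pseudo-dialect text via lexical + char noise."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if word in FUNCTION_LEXICON:            # function word: lexical swap
            out.append(FUNCTION_LEXICON[word])
        else:                                   # content word: char noise
            out.append(noise_word(word, rng))
    return " ".join(out)

print(dialectify("el caballo es bello"))
```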
📢 Check out DialUp, a technique to make your MT model robust to the dialect continua of its training languages, including unseen dialects.
arxiv.org/abs/2501.16581