Vitalii Hirak
v-hirak.bsky.social
PhD student in Natural Language Processing and Information Retrieval at the University of Düsseldorf. Working in the EmergentIR project at GESIS Cologne.
6/6: We hope our work will inspire further research on the intrinsic difficulty of translating and generating different languages in the age of LLMs, particularly through experimentation with alternative decoding strategies.

For now, I'm looking forward to presenting our work in Rabat, Morocco 🇲🇦
February 8, 2026 at 4:56 PM
5/6: In the context of searching for the model’s highest-probability translation, we found that languages with more complex morphology and flexible word order benefit more from a wider beam.

In other words, the standard practice of left-to-right beam search may be suboptimal for these languages.
February 8, 2026 at 4:56 PM
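As a rough illustration of the beam-size point in 5/6, here is a minimal sketch of left-to-right beam search over a toy next-token table. The table `TOY_MODEL`, its tokens, and its probabilities are invented for illustration and have nothing to do with NLLB-200, Tower+, or the experiments in the thread. The sketch only shows how a beam of 1 can commit to a locally likely first token and miss the sequence with the highest overall probability.

```python
import math

# Hypothetical toy "model": next-token probabilities for each prefix.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, beam_size, length):
    """Return (tokens, log_prob) of the best hypothesis found left to right."""
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Keep only the beam_size highest-scoring partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]


greedy = beam_search(TOY_MODEL, beam_size=1, length=2)
wide = beam_search(TOY_MODEL, beam_size=2, length=2)
print(greedy[0], round(math.exp(greedy[1]), 2))
print(wide[0], round(math.exp(wide[1]), 2))
```

With a beam of 1 the search keeps only "a" after the first step and ends on a sequence with probability 0.33, while a beam of 2 also keeps "b" and finds the sequence with probability 0.36.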
4/6: Through correlation and regression experiments, we found that language properties like typological distance, type/token ratio, and head-finality drive translation quality of both NMT models, even after controlling for more trivial factors such as language resourcedness and script similarity.
February 8, 2026 at 4:56 PM
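For readers unfamiliar with the type/token ratio mentioned in 4/6, a minimal sketch follows. It assumes lowercasing and whitespace tokenization, which is a simplification chosen for illustration; the thread does not say how the ratio was computed in the paper. The function name `type_token_ratio` is hypothetical.

```python
def type_token_ratio(text):
    """Number of distinct word forms divided by the total number of words."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


ratio = type_token_ratio("the cat saw the cat")
print(ratio)
```

Morphologically rich languages tend to produce more distinct word forms per running word, so a higher ratio on comparable text is often used as a rough proxy for morphological complexity.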
3/6: We analyze 2 NMT models, NLLB-200 and Tower+.

Although current SOTA has shifted to prompting decoder-only LLMs such as Tower+, we find that NLLB achieves higher chrF++ scores on all languages outside Tower's coverage, reaffirming the relevance of encoder-decoders for low-resourced languages.
February 8, 2026 at 4:56 PM
2/6: First, we compile a broad set of fine-grained typological and morphosyntactic features for 212 languages in the FLORES+ MT benchmark. We release this set publicly: github.com/v-hirak/expl...
February 8, 2026 at 4:56 PM
Henry Cavill is a creep though
June 17, 2025 at 10:54 AM
They aren't canonizing anything; this show is gonna be as canon as the millions of other people's playthroughs. It's just their take on the story.
June 17, 2025 at 10:53 AM
Thank you from a Ukrainian, Kala, sincerely 🙏 I love your Mass Effect content
February 28, 2025 at 10:34 PM