Raphaël Merx
rapha.dev
@rapha.dev
PhD @ UniMelb
NLP, with a healthy dose of MT

Based in 🇮🇩, worked in 🇹🇱 🇵🇬, from 🇫🇷
www2.statmt.org
They say it's because (1) test sets have become more challenging, (2) include more lang pairs, (3) are longer, and (4) used ESA instead of MQM. But we need an ablation study!
October 18, 2025 at 5:17 AM
kudos to whoever came up with that paper name 👌
October 6, 2025 at 8:44 AM
Thanks a lot! I didn't make it to Albuquerque unfortunately, but I hope to be in Vienna for ACL. Might see you there?
May 26, 2025 at 2:25 AM
Many thanks to Adérito Correia (Timor-Leste INL), and my supervisors Hanna Suominen and Katerina Vylomova!

Paper at aclanthology.org/2025.loresmt... , video presentation at youtu.be/8zenieJWRyg
Low-resource Machine Translation: what for? who for? An observational study on a dedicated Tetun language translation service
Raphael Merx, Adérito José Guterres Correia, Hanna Suominen, Ekaterina Vylomova. Proceedings of the Eighth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2025). 20...
aclanthology.org
May 25, 2025 at 1:11 AM
(3) The vast majority of usage is on mobile (over 90% of users / over 80k devices)

Takeaway: publishing MT models in mobile apps is probably more impactful than setting up a website / HuggingFace space.
May 25, 2025 at 1:11 AM
(2) Translation into Tetun is in higher demand (by >2x) than translation from Tetun

Takeaway for us MT folks: focus on translation into low-res langs, harder but more impactful
May 25, 2025 at 1:11 AM
We find that
(1) a LOT of usage is for educational purposes (>50% of translated text)
--> contrasts sharply with Tetun corpora (e.g. MADLAD), dominated by news & religion.

Takeaway: don't evaluate MT on overrepresented domains (e.g. religion)! You risk misrepresenting end-user exp.
May 25, 2025 at 1:11 AM
Very interesting findings, particularly the benefit (or lack thereof) of test-time scaling across domains
May 13, 2025 at 12:40 AM
AI dev tools. In particular agents: are they hype or useful or both?
March 31, 2025 at 3:20 AM
Perceptricon
March 26, 2025 at 8:29 AM
The right thing to do, thanks for this *SEM
March 17, 2025 at 8:19 AM
Super impactful, thank you for this! A natural sequel of Gatitos.

I'm esp. fond of your "researcher in the loop" method to ensure wide vocab coverage.
February 20, 2025 at 10:23 PM