kentonmurray.bsky.social
@kentonmurray.bsky.social
Research Faculty at John’s Hopkins University. Focused on Multilingual NLP, Machine Translation, Low-Resource, Multimodal (videos)
Reposted
📊 Preliminary ranking of WMT 2025 General Machine Translation benchmark is here!

But don't draw conclusions just yet - automatic metrics are biased for techniques like metric as a reward model or MBR. The official human ranking will be part of General MT findings at WMT.

arxiv.org/abs/2508.14909
Preliminary Ranking of WMT25 General Machine Translation Systems
We present the preliminary ranking of the WMT25 General Machine Translation Shared Task, in which MT systems have been evaluated using automatic metrics. As this ranking is based on automatic evaluati...
arxiv.org
August 23, 2025 at 9:28 AM
Calling for participants in our workshop on retrieving videos. Large, pre-trained models do poorly on this and there’s a lot of potential to push the field without needing a ton of GPUs.
🚨 IT'S HERE! 🚨
The Eval Leaderboard is now LIVE! 🏆💻

Our video retrieval collection stumps most pre-trained models. See if you can build a better system!

eval.ai/web/challeng...
EvalAI: Evaluating state of the art in AI
EvalAI is an open-source web platform for organizing and participating in challenges to push the state of the art on AI tasks.
eval.ai
April 29, 2025 at 6:50 PM
Reposted
#NAACL2025 30 April - 2pm, Hall 3, Special Theme
"Faux Polyglots: A study on Information Disparity in Multilingual Large Language Models".

Come visit and learn about how multilingual RALMs fail to handle multilingual information conflicts.

Teaser: youtu.be/aPS2Ntav1FE

#LLM #AI #NLProc
Are multilingual LLMs ready for the challenges of real-world information-seeking where diverse perspectives and facts are represented in different languages?

Unfortunately, No.

We find LLMs are faux polyglots.

📢Preprint: tinyurl.com/fdunz3dz

#LLMs #NLProc
April 28, 2025 at 2:46 PM
There’s a lot of PhD advisors who are neglecting key parts of their jobs. Your students should NEVER have 100s of words of text on a slide, nor a grainy image of a LaTeX table (bar charts please!) A good talk is now the exception not the norm. It’s all conferences - not just #AAAI2025
February 28, 2025 at 5:07 PM
Decided last minute to come to #AAAI2025 for the day. Hit me up if you are in Philly.
February 28, 2025 at 2:55 PM
Reposted
Dialects lie on continua of (structured) linguistic variation, right? And we can’t collect data for every point on the continuum...🤔
📢 Check out DialUp, a technique to make your MT model robust to the dialect continua of its training languages, including unseen dialects.
arxiv.org/abs/2501.16581
February 27, 2025 at 2:44 AM
Reposted
We now have a Slack channel for the 2025 Workshop on Multimodal Augmented Generation via Multimodal Retrieval!

This channel will serve as the primary communication method between authors, participants, and organizers: join.slack.com/t/magmarshar...
Slack
join.slack.com
February 6, 2025 at 8:23 PM
Reposted
New Workshop on Multimodal Augmented Generation via MultimodAl Retrieval (MAGMaR) to be held at @aclmeeting.bsky.social ACL in Vienna this summer. We have a new shared task that stumps most LLMs - including ones pretrained on our test collection. nlp.jhu.edu/magmar/
MAGMaR Workshop
MAGMaR
nlp.jhu.edu
January 14, 2025 at 7:05 PM