vamvas.ch
From Qwen3-8B-Base
✅ 100K synthetic problems: better than Qwen3-8B
✅ Combining with human-written problems: matches DeepSeek-R1-671B
🧵(1/5)
aclanthology.org/2025.acl-lon...
If you’re at #ACL, stop by to learn more!
Excited to present papers with @vamvas.bsky.social @ricosennrich.bsky.social on Unsupervised Translation Direction Detection and Multilingual Hallucination Detection!
Come say hi! 👋
#NLProc #NLP #NMT #LLMs
I've come to believe that multiple-choice exams are underrated. More in my blog post, “The Joy of Multiple-Choice.” vamvas.ch/the-joy-of-m...
We are honored to receive the Best Paper Award for it! ✨
Michelle's paper: arxiv.org/abs/2401.06769
Demo: huggingface.co/spaces/Zuric...
If you're at the expo, make sure to stop by the Department of Computational Linguistics UZH!
I was curious how GPT-4o can make use of predicted outputs to speed up text generation.
vamvas.ch/openai-predi...
@vamvas.bsky.social and @ricosennrich.bsky.social
Paper link:
arxiv.org/pdf/2503.10494
Long context LLMs have paved the way for document translation, but is simply inputting the whole content the optimal way?
Here's the thread 🧵 [1/n]
Rico is my former advisor and I can greatly recommend working with him. Apply by January 4: jobs.uzh.ch/offene-stell...
However, a generic implementation for @huggingface.bsky.social Transformers has been missing. Check out our new 𝗺𝗯𝗿🔥 repo, which is designed to work with any model and metric on the Hub: github.com/ZurichNLP/mbr
Our EMNLP paper investigates the task of Recognizing Semantic Differences (RSD) with simple, unsupervised approaches.
• Demo: huggingface.co/spaces/Zuric...
• Paper: huggingface.co/papers/2305....
#NLProc