Anar
anaryegen.bsky.social
PhDing at HiTZ Center, University of the Basque Country. NLP | LLM Factuality, multilinguality and more

anaryegen.github.io
Reposted by Anar
In last week’s seminar session @anaryegen.bsky.social talked about Mining Argument Structures in Medical Texts (aclanthology.org/2024.emnlp-m...) and Enhancing Factuality in Counter-argument Generation with real-time Knowledge (to be presented at #ACL2025NLP btw, preprint: arxiv.org/pdf/2503.05328)
June 23, 2025 at 10:05 AM
Reposted by Anar
🧙‍♂️ New paper 🧙‍♀️:
Presenting Wicked: a simple automated method to make MCQA benchmarks more challenging. Wicked shook up 18 open-weight LLMs on 6 benchmarks, with up to a 19.7% performance drop under direct prompting 🤯
Paper: shorturl.at/1CGq0
Code: shorturl.at/n2nCU
February 26, 2025 at 11:51 AM
Reposted by Anar
I couldn't be happier! While preparing the trip to #NeurIPS2024, we received notification that our paper with Paula Ontalvilla and Aitor Ormazabal was accepted at #COLING2025. Paula has just started her PhD at @hitz-zentroa.bsky.social and she is already publishing at this level!
December 2, 2024 at 9:21 AM
Reposted by Anar
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B — As always, we released our data, code, recipes and more 🎁
November 26, 2024 at 8:51 PM
Reposted by Anar
Looking at ICLR submissions with the lowest score - What a work of art! 🧵
November 25, 2024 at 5:52 PM
Reposted by Anar
We launched Judge Arena with @huggingface.bsky.social @clefourrier.bsky.social - a platform that lets you easily compare models as judges side-by-side and vote for the best evaluation

Check out the live leaderboard and start voting now 🤗
November 19, 2024 at 7:08 PM