Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋
📋 Preprint: arxiv.org/abs/2406.04127
👨🏻‍💻 GitHub: github.com/aryopg/mmlu-...
🤗 HuggingFace: huggingface.co/datasets/edi...
The result of months of work with the goal of advancing Multilingual LLM evaluation.
Built together with the community and amazing collaborators at Cohere4AI, MILA, MIT, and many more.