Katya Artemova
katya-art.bsky.social
#NLProc researcher @ Toloka AI, ex LMU, ex HSE

Low-resource languages | culture-aware LLMs | machine-generated text detection
Poster Session 8 - R&E; Hall 3, May 2, 11-12:30
April 29, 2025 at 7:52 PM
Thank you for your answer! I co-authored RuBLIMP, so I was curious about your take on this and whether you had the same experience: when we first ran the experiments without decontamination, the results seemed super inflated. But it seems like that's not the case for low-resource languages, then.
April 21, 2025 at 7:32 PM
Hi! Great work and thanks for sharing! I wonder if there is a chance all these LLMs have been trained on the UD data? Aren’t they contaminated?
April 17, 2025 at 5:47 PM
Check out our repo: github.com/eloquent-lab... !
February 11, 2025 at 7:44 PM
Co-organizers: Akim Tsvigun (University of Amsterdam and Nebius), Dominik Schlechtweg (University of Stuttgart), with Natalia Fedorova, Boris Obmoroshev, Sergei Tilga, Ekaterina Artemova, and Konstantin Chernyshev from Toloka
January 7, 2025 at 1:10 PM
Such a good thread idea!

arxiv.org/abs/2305.10284

"Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks" by Anas Himmi et al. They explore how to rank LLMs when scores for some tasks are missing, and use the Borda count to construct reliable leaderboards.
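The Borda-count idea can be sketched in a few lines: per task, each system earns points equal to how many scored systems it beats, and totals are summed across tasks so that missing scores simply contribute nothing. This is an illustrative sketch only (the function name and toy scores are my own), not the exact procedure from the paper.

```python
# Illustrative Borda-count aggregation over tasks with missing scores.
# Not the authors' exact method; names and data are hypothetical.

def borda_rank(scores):
    """scores: {system: {task: score}} -> systems sorted best-first.

    For each task, only systems with a score are ranked; a system that
    beats r others on that task earns r points. Missing scores add 0.
    """
    tasks = {t for per_sys in scores.values() for t in per_sys}
    points = {sys: 0 for sys in scores}
    for task in tasks:
        scored = [(sys, v[task]) for sys, v in scores.items() if task in v]
        # ascending sort: worst earns 0 points, best earns len(scored)-1
        for rank, (sys, _) in enumerate(sorted(scored, key=lambda x: x[1])):
            points[sys] += rank
    return sorted(points, key=points.get, reverse=True)

# Toy leaderboard: system "C" has no score on task "t2".
toy = {
    "A": {"t1": 0.9, "t2": 0.5},
    "B": {"t1": 0.7, "t2": 0.8},
    "C": {"t1": 0.6},
}
print(borda_rank(toy))
```

On the toy data, "C" lands last because it earns no points on either task; ties between "A" and "B" are broken arbitrarily, which is one of the subtleties a real leaderboard aggregation has to handle.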
November 27, 2024 at 10:12 AM