Jindra Helcl
@jindrahelcl.bsky.social
Reposted by Jindra Helcl
We’re collecting crowd-sourced translations in Piedmontese and Neapolitan.
🎯 Goal: see how well LLMs understand these languages.
👉 Participate here (in IT🇮🇹):
- Piedmontese: quest.ms.mff.cuni.cz/crowd-transl...
- Neapolitan: quest.ms.mff.cuni.cz/crowd-transl...
Anyone can join, no need to be fluent!
Welcome to CrowdTranslation
quest.ms.mff.cuni.cz
November 10, 2025 at 2:10 PM
Reposted by Jindra Helcl
Attenzione! 🇮🇹 Know Piedmontese or Neapolitan speakers? @gianlucavico.bsky.social is collecting crowd-sourced translations to evaluate LLM performance on these regional languages. Partecipate!
November 10, 2025 at 2:36 PM
Does your model know the difference between koprovka and kulajda? 🍽️ Does it recognize famous Ukrainians from their statues? 🗽 And what if you ask in Slovak? 😱 Check out our new regional QA dataset and find out!! 🤯

Now available on Hugging Face huggingface.co/datasets/ufa...
🧵 We're releasing CUS-QA - a new benchmark for testing LLMs on regional knowledge!
Find out what your model knows about Czechia 🇨🇿, Slovakia 🇸🇰, and Ukraine 🇺🇦!
👉 Textual and visual questions, answers, and human judgment on model outputs!
huggingface.co/datasets/ufa...
www.arxiv.org/abs/2507.22752
ufal/cus-qa · Datasets at Hugging Face
huggingface.co
September 2, 2025 at 11:19 AM
We need to have poster fights at the end of every conference.
July 29, 2025 at 7:01 PM
Reposted by Jindra Helcl
📢I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?sho....
Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.
Postdoc in Natural Language Processing
employment.ku.dk
July 7, 2025 at 12:47 PM
Reposted by Jindra Helcl
📢 First release: 38 monolingual reference LLMs (2.15B params) via #HPLT + #OpenEuroLLM

⚙️Trained on 100B tokens from HPLT v2 dataset
🌍 Cover EU langs + others
⚙️ Based on LLaMA, trained on #LUMI
📈 Useful for evaluation

Downloads + more info at openeurollm.eu/blog/hplt-oe...
July 18, 2025 at 9:32 AM
Reposted by Jindra Helcl
this "class 9" is such a cool idea for an LLM course!

(from ufal.mff.cuni.cz/courses/npfl... via @zdenekkasner.bsky.social )
June 8, 2025 at 8:45 PM
Petition against renaming the Českomoravská Metro station, sign and share! (Czech ID needed)
gov.cz/e-petice/118...
May 28, 2025 at 7:08 AM
Am I the only one to think that these should always be aligned with the direction of travel? (Especially if you already have more than one version of these and the trains never turn.)
May 14, 2025 at 4:41 PM
Reposted by Jindra Helcl
I'm part of this! There's also a paper: arxiv.org/abs/2503.10267
** New parallel data set ** We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total. Extracted from the Internet Archive and Common Crawl hplt-project.org/datasets/v2.0
HPLT - High Performance Language Technologies
A space that combines petabytes of natural language data with large-scale model training
hplt-project.org
March 17, 2025 at 1:27 PM
Come to MT Marathon! It's always great fun, and this year's marathon in Helsinki is not going to be an exception! See everyone there! 💥❤️
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
March 19, 2025 at 9:48 AM
Reposted by Jindra Helcl
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
March 18, 2025 at 1:10 PM
Reposted by Jindra Helcl
Kick-off successfully completed. Go OpenEuroLLM team!
openeurollm.eu
March 7, 2025 at 2:29 PM
Hello world!
February 3, 2025 at 6:19 PM