Tanel Alumäe
tanelalumae.bsky.social
Associate Professor of Speech Processing
Tallinn University of Technology, Estonia
"And I'll see the day that anyone gives us #1 without being forced to do so ..."

There are many LLM projects that are open about their training and evaluation data, such as AllenAI OLMo, several EU projects (EuroGPT, HPLT), and several Hugging Face projects. I don't think anybody forced them to do so.
January 30, 2025 at 11:19 AM
Great challenge but very little time...

What is the maximum length of a test utterance (important considering limited GPU RAM on the test server)?

Is ASR CER case sensitive? Are spaces taken into account when computing CER?
December 4, 2024 at 10:15 PM
What is the maximum length of a test utterance (important considering limited GPU RAM on the test server)?

Is ASR CER case sensitive? Are spaces taken into account when computing CER?
December 1, 2024 at 10:52 AM
Very interesting challenge! Unfortunately, there is very little time, considering that participants would have to prepare some kind of container that decodes the test data on the Dynabench server.
Some questions follow...
December 1, 2024 at 10:45 AM