cointegrated.bsky.social
@cointegrated.bsky.social
We (oldi.org) recently released version 3.0 of the FLORES+ dataset: a benchmark for multilingual machine translation.

In this version, we added the Ladin language (there are now 222 language varieties in the dataset!), corrected the spelling for Chuvash and Dargwa, and fixed the sentence order in Aranese.
July 5, 2025 at 1:14 PM