Low-resource languages | culture-aware LLMs | machine-generated text detection
Beemo is one of the first benchmarks designed to evaluate AI detector performance on mixed-authorship content — texts generated by LLMs and edited by human experts or refined by other models.
We invite systems that can predict human preferences for different LLM outputs and explain their predictions across five criteria: relevance, naturalness, truthfulness, safety, and overall quality.
Dev Stage: Feb 3 – Mar 2
You’ll learn how to speed up data annotation and reduce costs and human workload.
Discover more: toloka.ai/events/tolok...