navitas.bsky.social
Reposted by navitas.bsky.social
Today, @tuetschek.bsky.social shared his team's work on evaluating LLM text generation with both human annotation frameworks and LLM-based metrics. Their approach addresses the benchmark data leakage problem and the question of how to obtain unseen data for unbiased LLM testing.
April 30, 2025 at 12:02 PM
Reposted by navitas.bsky.social
👉Machine Learning Prague 2025👈 is happening right now! Today, @patuchen.bsky.social and @navitas.bsky.social presented their posters on text generation with LLMs. Also, don't miss @tuetschek.bsky.social's invited talk tomorrow at 11 a.m.
April 29, 2025 at 2:08 PM
Reposted by navitas.bsky.social
How do LLMs compare to human crowdworkers in annotating text spans? 🧑🤖

And how can span annotation help us with evaluating texts?

Find out in our new paper: llm-span-annotators.github.io

arXiv: arxiv.org/abs/2504.08697
Large Language Models as Span Annotators
April 15, 2025 at 11:10 AM