Lightnews — Scholar-powered news

Alex Gill

@agill32.bsky.social

390 followers 360 following 9 posts

NLP researcher at U of U

Reposted by Alex Gill

Nathan Kalman-Lamb

@nkalamb.bsky.social

Folks, I don’t know how it’s possible, but it gets funnier.

November 21, 2025 at 3:19 PM

Alex Gill

@agill32.bsky.social

I'll be in Suzhou 🇨🇳 at #EMNLP this week presenting "What has been Lost with Synthetic Evaluation?" done with @anamarasovic.bsky.social & @lasha.bsky.social! 🎉

📍Findings Session 1 - Hall C
📅 Wed, November 5, 13:00 - 14:00

arxiv.org/abs/2505.22830

November 3, 2025 at 11:03 AM

Reposted by Alex Gill

Women in AI Research - WiAIR

@wiair.bsky.social

🧠 Can large language models build the very benchmarks used to evaluate them?
In “What Has Been Lost with Synthetic Evaluation”, Ana Marasović (@anamarasovic.bsky.social) and collaborators ask what happens when LLMs start generating the datasets used to test their reasoning. (1/6🧵)

October 20, 2025 at 4:01 PM

Alex Gill

@agill32.bsky.social

𝐖𝐡𝐚𝐭 𝐇𝐚𝐬 𝐁𝐞𝐞𝐧 𝐋𝐨𝐬𝐭 𝐖𝐢𝐭𝐡 𝐒𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐄𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧?

(arxiv.org/abs/2505.22830)

I'm happy to announce that the preprint release of my first project is online! Developed with the amazing support of @lasha.bsky.social & @anamarasovic.bsky.social

What Has Been Lost with Synthetic Evaluation?

Large language models (LLMs) are increasingly used for data generation. However, creating evaluation benchmarks raises the bar for this emerging paradigm. Benchmarks must target specific phenomena, pe...

arxiv.org

June 4, 2025 at 10:24 PM
