Lightnews — Scholar-powered news

@dpeskoff.bsky.social

6 followers 28 following 1 post

Reposted

Nishant Balepur

@nbalepur.bsky.social

🚨 New Position Paper 🚨

Multiple choice evals for LLMs are simple and popular, but we know they are awful 😬

We complain they're full of errors, saturated, and test nothing meaningful, so why do we still use them? 🫠

Here's why MCQA evals are broken, and how to fix them 🧵

February 24, 2025 at 9:04 PM

dpeskoff.bsky.social

@dpeskoff.bsky.social

Hello, World!

April 5, 2025 at 3:21 AM
