Arthur Conmy
arthurconmy.bsky.social
Aspiring 10x reverse engineer at Google DeepMind
Good question. I'm not sure. arxiv.org/abs/2310.18512 has a bunch of discussion of mitigations
Preventing Language Models From Hiding Their Reasoning
Large language models (LLMs) often benefit from intermediate steps of reasoning to generate answers to complex problems. When these intermediate steps of reasoning are used to monitor the activity of ...
arxiv.org
January 4, 2025 at 3:33 PM
This isn't just about edge cases either: 1) happens with nice models like Claude, and 2) is even true for dumb models like Gemma-2 2B
January 2, 2025 at 8:50 PM
2) Verification is considerably harder than generation. Even when there are only a few hundred tokens, it often takes me several minutes to understand whether the reasoning is OK or not
January 2, 2025 at 8:50 PM
Still, there is no ground truth for interpretability, so progress is tough
December 10, 2024 at 7:45 PM
Awesome research. The caveat is that humans were working for 8 hours but were explicitly encouraged to get results after 2 hours, so I buy the claim.
November 22, 2024 at 10:22 PM