Lightnews — Scholar-powered news

Oliver Daniels

@oadaniels.bsky.social

14 followers 140 following 9 posts

CS PhD student at UMass Amherst, AI safety stuff

Oliver Daniels

@oadaniels.bsky.social

Something like the ability to influence and coordinate other agents around a desired aim

I agree that directly optimizing for this objective would be very hard

But political skill is clearly instrumental to many objectives that are easy to measure

June 19, 2025 at 6:08 AM

Oliver Daniels

@oadaniels.bsky.social

are you objecting to treating "politics" as a one-dimensional skill, or to the claim that AI could perform well at this skill (or both)?

obviously politics involves many sub-skills, but it's still intelligible (and descriptively useful) to talk about humans as "good" or "bad" at politics

June 10, 2025 at 9:00 AM

Oliver Daniels

@oadaniels.bsky.social

Excited to see gradient routing applied to more realistic scalable oversight benchmarks, e.g. @eleutherai.bsky.social's recent work on "scalable elicitation" under different label budgets arxiv.org/abs/2410.13215

Balancing Label Quantity and Quality for Scalable Elicitation

Scalable oversight studies methods of training and evaluating AI systems in domains where human judgment is unreliable or expensive, such as scientific research and software engineering in complex cod...

arxiv.org

December 8, 2024 at 6:31 PM

Oliver Daniels

@oadaniels.bsky.social

Also the toy scalable oversight task is super cool, isolating the idea that oversight failures will be correlated with distributional shifts, and the most important distributional shift might be "is oversight being provided"

December 8, 2024 at 6:31 PM

Oliver Daniels

@oadaniels.bsky.social

Part of "their stuff" is economic populism, and ~everybody (including you) still thinks that can work

November 30, 2024 at 4:00 PM

Oliver Daniels

@oadaniels.bsky.social

But nah overall vox is great, if for the EA content alone

November 25, 2024 at 1:48 PM

Oliver Daniels

@oadaniels.bsky.social

Based

November 25, 2024 at 1:47 PM

Oliver Daniels

@oadaniels.bsky.social

When does Ezra not get it

November 22, 2024 at 7:37 PM
