Lightnews — Scholar-powered news

@chiwilliams.bsky.social

1 follower, 120 following, 2 posts

chiwilliams.bsky.social

> It's quite easy to accidentally undo current AI alignment methods, e.g. by just training on some naughty numbers

I agree emergent misalignment was very important this year. But the headline was already known: e.g., benign fine-tuning has always had the chance of removing safeguards, right?

December 17, 2025 at 9:52 PM
