vkrakovna.bsky.social
@vkrakovna.bsky.social
Research scientist in AI alignment at Google DeepMind. Co-founder of Future of Life Institute. Views are my own and do not represent GDM or FLI.
As models advance, a key AI safety concern is deceptive alignment / "scheming" – where AI might covertly pursue unintended goals. Our paper "Evaluating Frontier Models for Stealth and Situational Awareness" assesses whether current models can scheme. arxiv.org/abs/2505.01420
July 8, 2025 at 12:11 PM
Reposted
Super excited this giant paper outlining our technical approach to AGI safety and security is finally out!

No time to read 145 pages? Check out the 10-page extended abstract at the beginning of the paper.
April 4, 2025 at 2:27 PM
We are excited to release a short course on AGI safety! The course offers a concise and accessible introduction to AI alignment problems and our technical / governance approaches, consisting of short recorded talks and exercises (75 minutes total). deepmindsafetyresearch.medium.com/1072adb7912c
February 14, 2025 at 4:06 PM