Ryan Greenblatt
ryangreenblatt.bsky.social
Chief scientist at Redwood Research (https://www.redwoodresearch.org/), focused on technical AI safety research to reduce risks from rogue AIs
Anthropic has (relatively) official AGI timelines: powerful AI by early 2027. I think this prediction is unlikely to come true and I explain why in a new post.

I also give a proposed timeline with powerful AI in early 2027 so we can (hopefully) update before it is too late.
What's up with Anthropic predicting AGI by early 2027? — LessWrong
As far as I'm aware, Anthropic is the only AI company with official AGI timelines[1]: they expect AGI by early 2027. In their recommendations (from M…
www.lesswrong.com
November 3, 2025 at 5:25 PM
Dario has recently been claiming that his prediction of AIs writing 90% of code in 3-6 months has come true.

I'm skeptical, though I agree that AIs are writing a high fraction of code at Anthropic.
Is 90% of code at Anthropic being written by AIs?
I'm skeptical that Dario's prediction of AIs writing 90% of code in 3-6 months has come true
blog.redwoodresearch.org
October 22, 2025 at 5:20 PM
Studying actual scheming AIs might be difficult (as they don't want to be studied!). Can we instead just study AIs trained to exhibit misaligned/scheming behavior?

I discuss how promising this is and how we might do this in a new post: www.lesswrong.com/posts/v6K3hn...
www.lesswrong.com
October 16, 2025 at 5:06 PM
Anthropic, GDM, and xAI say nothing about whether they train against Chain-of-Thought (CoT), while OpenAI claims it doesn't.

AI companies should be transparent about whether (and how) they train against CoT. While OpenAI is doing better, all AI companies should say more. 1/
October 10, 2025 at 4:31 PM
Studying coherent scheming in AIs seems tricky, but there might be a feedback loop: we create schemers to study by iterating against detection methods, then use those schemers to improve the detectors. This iteration could start on weak AIs and transfer to stronger ones.
Iterated Development and Study of Schemers (IDSS)
A strategy for handling scheming
blog.redwoodresearch.org
October 10, 2025 at 4:11 PM
What we should do to mitigate misalignment risk depends a lot on the level of buy-in/political will (from AI companies, US government, and China).

I've found it helpful to separate this out into Plan A/B/C/D and to plan for these situations somewhat separately.

I say more in a new post:
Plans A, B, C, and D for misalignment risk
Different plans for different levels of political will
blog.redwoodresearch.org
October 8, 2025 at 6:45 PM
I now think very short AGI timelines are less likely. I updated due to GPT-5 being slightly below trend and not seeing fast progress in 2025.

At the start of 2025, I thought full automation of AI R&D before 2029 was ~25% likely, now I think it's only ~15% likely.
My AGI timeline updates from GPT-5 (and 2025 so far)
AGI before 2029 now seems substantially less likely
blog.redwoodresearch.org
October 7, 2025 at 7:31 PM
I'm now on Bluesky. You can see my X/Twitter account here: x.com/RyanPGreenbl.... I post about AGI timelines, takeoff, and misalignment risk.

My Bluesky account will cross-post my posts from X/Twitter, starting with some historical posts that people might find interesting.
October 7, 2025 at 7:22 PM