Luca Righetti
lucarighetti.bsky.social
Research Open_Phil, co-host HearThisIdea. Views my own.
🔸10% Pledge at GivingWhatWeCan.
I swear every research org has struggled with: "How do we share more WIP without people treating it as final?"

Love how clicking @METR_Evals's new Notes page changes the whole site to handwritten font and chalk background.

Strong visual screaming "no seriously, this is rough".
October 18, 2025 at 1:01 AM
How can we verify that AI ChemBio safety tests were properly run?

Today we're launching STREAM: a checklist for more transparent eval results.

I read a lot of model reports. Often they miss important details, like human baselines. STREAM helps make peer review more systematic.
September 2, 2025 at 4:03 PM
I've been procrastinating on this chart of all model card releases by OpenAI, GDM, and Anthropic:
• 4 cases of late safety results (out of 27, so ~15%)
• Notably 2 cases where late results showed increases in risk
• The most recent set of releases in August were all on time
x.com/HarryBooth5...
August 29, 2025 at 5:55 PM
How concerned should we be about AIxBio? We surveyed 46 bio experts and 22 superforecasters:

If LLMs do very well on a virology eval, human-caused epidemics could increase 2-5x.

Most thought this was >5yrs away. In fact, the threshold was hit just *months* after the survey. 🧵
July 1, 2025 at 3:09 PM
Three weeks ago a car bomb exploded outside an IVF clinic in California, injuring four people.

Now court documents against his accomplice show the terrorist asked AI to help build the bomb.

A thread on what I think those documents do and don't show 🧵…
x.com/CNBC/status...
June 9, 2025 at 9:32 AM
OpenAI and Anthropic *both* warn there's a sig. chance that their next models might hit ChemBio risk thresholds -- and are investing in safeguards to prepare.

Kudos to OpenAI for consistently publishing these eval results, and great to see Anthropic now sharing a lot more too.
February 26, 2025 at 12:49 AM
Bizarre that a monkey can cause >10X the blackout damage of Russian hackers
February 17, 2025 at 8:24 PM
A few weeks ago, I “peer-reviewed” o1-preview's ChemBio safety card and highlighted some issues about its methodology.

Now that o1 is out, how does it stack up?

Better! (Though there’s still room for improvement.)

Here’s my new o1 scorecard. 🧵👇
December 10, 2024 at 7:57 PM
Most climate deaths will occur in developing countries, especially in slow-growth scenarios where adaptation is unaffordable.

Framing climate change as an inequality problem —not an extinction risk— highlights the need for global aid, LMIC growth, and valuing all lives equally.
December 2, 2024 at 8:16 PM