Geoffrey Irving
@girving.bsky.social
Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
"Nobody suspects the all-1 string."
January 29, 2026 at 2:09 PM
Symmetric block ciphers like AES and the cores of modern hash functions are, roughly, keyed pseudorandom invertible functions. So a natural question is: if you pick a big enough nonlinear keyed invertible function at random, is it a secure block cipher? 🧵
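
As a toy sketch of the object in question (my own illustration, not from the post): the idealised version of a keyed pseudorandom invertible function is a family of independent, uniformly random permutations, one per key. Below is a minimal Python version on 8-bit blocks; it is obviously not a secure cipher at this size, it just shows what "keyed, pseudorandom, and invertible" means.

# Toy model only: a "keyed random permutation" on 8-bit blocks (names and sizes are my choices).
import random

def keyed_permutation(key: int, n_bits: int = 8):
    domain = list(range(1 << n_bits))
    rng = random.Random(key)       # the key selects which permutation we get
    rng.shuffle(domain)            # Fisher-Yates shuffle: a uniformly random permutation
    inverse = [0] * len(domain)
    for x, y in enumerate(domain):
        inverse[y] = x
    return (lambda x: domain[x]), (lambda y: inverse[y])

enc, dec = keyed_permutation(key=42)
assert all(dec(enc(x)) == x for x in range(256))   # invertible by construction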
January 17, 2026 at 11:26 PM
Unless... @qntm.org

"Very funny polling result where ~3/4 of people will say they read a book last year but if you ask them to name the book the share drops 20 points"

x.com/JosephPolita...
December 31, 2025 at 1:20 AM
New report on trends in AISI's evaluations of frontier AI models over the past two years. A lot of AI discourse focuses on viral moments, but it is important to zoom out to the less flashy trend: AI models are steadily growing in capabilities, including dual-use capabilities.

www.aisi.gov.uk/frontier-ai-...
December 18, 2025 at 10:06 AM
I think I’m all right. Thank you for homeschooling me, Mom!

www.nytimes.com/2025/12/14/o...
Opinion | Home-Schooled Kids Are Not All Right
www.nytimes.com
December 14, 2025 at 6:46 PM
Lovely blog post version of a talk Scott Aaronson gave at the UK AISI Alignment Conference on theory and AI alignment. Thank you, Scott!

scottaaronson.blog?p=9333
Theory and AI Alignment
The following is based on a talk that I gave (remotely) at the UK AI Safety Institute Alignment Workshop on October 29, and which I then procrastinated for more than a month in writing up. Enjoy! T…
scottaaronson.blog
December 7, 2025 at 10:35 AM
A perk of being an American living in London who is from Alaska is that frequently when talking about temperatures I can refer to just "40 below" with no qualifiers.
November 28, 2025 at 10:32 AM
Do you want to fund AI alignment research?

The AISI Alignment Team and I have reviewed >800 Alignment Project Applications from 42 countries, and we have ~100 that are very promising. Unfortunately, this means we have a £13-17M funding gap! Thread with details! 🧵
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
November 27, 2025 at 6:25 PM
The UK AI Security Institute ran an Alignment Conference from 29-31 October in London! The goal was to gather a mix of people experienced in and new to alignment, and get into the details of novel approaches to alignment and related problems. Hopefully we helped create some new research bets! 🧵
November 13, 2025 at 5:00 PM
Reposted by Geoffrey Irving
🚨New paper🚨

From a technical perspective, safeguarding open-weight model safety is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.

🧵🧵🧵
November 12, 2025 at 2:04 PM
There is a real chance that my most important positive contribution to the world will have been to say something wrong on the internet.
November 10, 2025 at 10:24 AM
The UK AISI Cyber Autonomous Systems Team is hiring propensity researchers to grow the science around whether models *are likely* to attempt dangerous behaviour, as opposed to whether they are capable of doing so. 🧵

job-boards.eu.greenhouse.io/aisi/jobs/47...
Research Scientist - CAST Propensity
London, UK
job-boards.eu.greenhouse.io
November 7, 2025 at 9:14 AM
Spooky:

import Batteries.Data.UInt

-- UInt64.size is 2^64, so `UInt64.ofNat UInt64.size` wraps to 0 and `0 - 1` wraps to 2^64 - 1.
def danger : UInt64 := UInt64.ofNat UInt64.size - 1
-- Kernel evaluation agrees that `danger` is 2^64 - 1...
theorem danger_eq_large : danger = 18446744073709551615 := by decide +kernel
-- ...but compiled (native) evaluation claims it is 1...
theorem danger_eq_one : danger = 1 := by native_decide
-- ...and the two contradictory proofs combine into a proof of False.
theorem bad : False := by simpa using danger_eq_large.symm.trans danger_eq_one
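
To spell out why this is spooky (my own addition, not part of the original snippet): once bad : False is provable, any proposition at all follows, for example:

-- With `bad : False` in scope, any statement can be "proved"; this line is my illustration.
theorem anything : 2 + 2 = 5 := bad.elim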
October 31, 2025 at 10:04 PM
Reposted by Geoffrey Irving
the time it would have taken me would probably have been of order of magnitude an hour (an estimate that comes with quite wide error bars). So it looks as though we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us. 3/3
October 31, 2025 at 7:25 PM
Reposted by Geoffrey Irving
I published a new post on my rarely updated personal blog! It's a sequel of sorts to my Quanta coverage of the Busy Beaver game, focusing on a particularly fearsome Turing machine known by the awesome name Antihydra.
Why Busy Beaver Hunters Fear the Antihydra
In which I explore the biggest barrier in the busy beaver game. What is Antihydra, what is the Collatz conjecture, how are they connected, and what makes them so daunting?
benbrubaker.com
October 27, 2025 at 4:04 PM
Another strong transition from @matt-levine.bsky.social.
October 23, 2025 at 7:59 PM
New AISI report mapping cruxes for whether AI progress might be fast or slow towards systems near or beyond human-level at most cognitive tasks. The goal is not to resolve uncertainties but to reflect them: we don't know how AI will go, and should plan accordingly!

www.aisi.gov.uk/research/und...
Understanding AI Trajectories: Mapping the Limitations of Current AI Systems
www.aisi.gov.uk
October 23, 2025 at 3:17 PM
New open source library from the UK AI Security Institute! ControlArena lowers the barrier to secure and reproducible AI control research, to boost work on blocking and detecting malicious actions in case AI models are misaligned. In use by researchers at GDM, Anthropic, Redwood, and MATS! 🧵
October 22, 2025 at 6:04 PM
There's a nice recent post by @tobyord.bsky.social on the efficiency of pretraining vs. RL, arguing that RL can learn at most 1 bit per episode given binary reward. It's right that RL is less efficient, but 1 bit is not actually a limit in practice. 🧵 on why:

www.tobyord.com/writing/inef...
The Extreme Inefficiency of RL for Frontier Models — Toby Ord
The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling...
www.tobyord.com
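
For scale, here is the back-of-the-envelope arithmetic behind a 1,000x to 1,000,000x gap, using my own assumed numbers (roughly one bit of usable signal per pretraining token, and RL episodes of 10^3 to 10^6 tokens with a single binary reward):

# Illustrative only; the per-token signal and episode lengths below are assumptions, not measurements.
pretrain_bits_per_token = 1.0                  # assumed order of magnitude for pretraining
for episode_tokens in (1_000, 1_000_000):      # assumed range of RL episode lengths
    rl_bits_per_token = 1.0 / episode_tokens   # at most 1 bit of binary reward, spread over the episode
    ratio = pretrain_bits_per_token / rl_bits_per_token
    print(f"{episode_tokens} tokens/episode -> ~{ratio:,.0f}x fewer bits per token for RL")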
October 16, 2025 at 8:53 AM
Is there Matt Levine but for pure mathematics?
October 1, 2025 at 5:30 PM
Ominous start to a Wikipedia page about a formula...

en.wikipedia.org/wiki/Fa%C3%A...
September 29, 2025 at 9:02 PM
Reposted by Geoffrey Irving
Amongst the projects funded is my project www.renaissancephilanthropy.org/a-dataset-of... to create what in 2025 is a super-hard dataset of pairs (informal hard proof, formal statement) of recent results from top journals. The challenge for the machine is to formalise the rest of the paper.
www.renaissancephilanthropy.org
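
To give a sense of the (informal, formal statement) format with a deliberately easy toy example, nothing like the hard journal results the dataset targets: the informal statement "there are infinitely many primes" would be paired with a formal statement such as the following (my illustration, using a lemma that already exists in Mathlib):

import Mathlib

-- Formal statement half of a toy pair; the dataset's statements would come from recent papers.
theorem toy_pair_example : ∀ n : ℕ, ∃ p, n ≤ p ∧ p.Prime :=
  Nat.exists_infinite_primes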
September 18, 2025 at 8:25 AM
Cas is very good and you should hire him as faculty!
📌📌📌
I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com
Stephen Casper
Visit the post for more.
stephencasper.com
September 4, 2025 at 12:38 PM
From near the end of Sleepwalkers, by Christopher Clark, as World War I starts.
August 23, 2025 at 3:40 PM
Reposted by Geoffrey Irving
I'm honored to serve as Expert Advisor for "The Alignment Project", an international initiative dedicated to ensuring AI systems are safe and beneficial. They are providing significant funding, compute, and collaboration opportunities for researchers, including those in cogsci/neuro. Please apply!
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
August 20, 2025 at 5:54 PM