Geoffrey Irving
@girving.bsky.social
Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
Pinned
Do you want to fund AI alignment research?

The AISI Alignment Team and I have reviewed more than 800 Alignment Project applications from 42 countries, and ~100 of them are very promising. Unfortunately, this means we have a £13-17M funding gap! Thread with details! 🧵
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
Being one of the two Deputy Directors of AISI's Research Unit is a central and important role! Please apply if interested!

> This isn’t your average Civil Service job. For 9–12 months, you’ll co-lead one of the world’s most influential AI safety research organisations.

x.com/nateburnikel...
February 2, 2026 at 5:13 PM
One of the more useless things I did while at Google Brain was to write down random access into xorshift128+, the hardware random number generator on TPUs. Purely a stunt: it could theoretically have meant TPU-native Jax-style random numbers faster than Threefry, but in practice random numbers are…
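For flavor, here is the trick in miniature. The xorshift128+ *state update* is linear over GF(2) — the "+" only appears in the output, not in the state — so the state after n steps is a 128×128 bit-matrix power applied to the initial state, computable in O(log n) squarings. A minimal Python sketch of the idea, using the shift constants from Vigna's paper rather than whatever the TPU hardware actually uses:

MASK64 = (1 << 64) - 1

def xorshift128p_step(s0, s1):
    # One step of the xorshift128+ state update (shift constants 23/17/26
    # from Vigna's paper; the TPU variant may differ).
    x, y = s0, s1
    x ^= (x << 23) & MASK64
    x ^= x >> 17
    x ^= y ^ (y >> 26)
    return y, x  # the generator's *output* would be (y + x) & MASK64

def pack(s0, s1):
    # View the 128-bit state as one integer, s0 in the low 64 bits.
    return s0 | (s1 << 64)

def transition_columns():
    # Column j of the 128x128 GF(2) transition matrix T is step applied to e_j.
    return [pack(*xorshift128p_step((1 << j) & MASK64, (1 << j) >> 64))
            for j in range(128)]

def mat_vec(cols, v):
    # T @ v over GF(2): XOR the columns selected by the set bits of v.
    out = 0
    while v:
        out ^= cols[(v & -v).bit_length() - 1]
        v &= v - 1
    return out

def jump(s0, s1, n):
    # State after n steps via square-and-multiply on T: O(log n) squarings.
    cols, acc = transition_columns(), pack(s0, s1)
    while n:
        if n & 1:
            acc = mat_vec(cols, acc)
        cols = [mat_vec(cols, c) for c in cols]  # T <- T @ T
        n >>= 1
    return acc & MASK64, acc >> 64

# Sanity check against plain iteration:
s = (0x123456789ABCDEF0, 0x0FEDCBA987654321)
t = s
for _ in range(1000):
    t = xorshift128p_step(*t)
assert jump(*s, 1000) == t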
February 1, 2026 at 7:05 PM
An important thing to remember as AI develops is that, regardless of whether capabilities plateau or how far they grow, computer science will still apply! AI won't be magic, some computations will be intractable, P won't be NP, etc. 🧵
January 31, 2026 at 5:42 PM
Achievement unlocked: trip to get passport photos as a family, but not all for the same country.
January 31, 2026 at 1:26 PM
"Nobody suspects the all-1 string."
January 29, 2026 at 2:09 PM
Symmetric block ciphers like AES and the cores of modern hash functions are, roughly, keyed pseudorandom invertible functions. So a natural question is: if you pick a big enough nonlinear keyed invertible function at random, is it a secure block cipher? 🧵
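As a toy illustration of the object in question (my sketch, not the thread's argument): an "ideal cipher" is exactly a key-indexed family of uniformly random permutations. For a tiny block size you can literally build one, which also shows why you can't at 128 bits — the table has 2^n entries:

import random

def toy_ideal_cipher(key, n_bits):
    # The key seeds a uniformly random permutation of {0,1}^n_bits via a
    # Fisher-Yates shuffle. Purely illustrative: random.Random is a
    # stand-in for a proper PRG, and the 2^n_bits table is exactly why
    # real block ciphers use structured keyed rounds instead.
    rng = random.Random(key)
    table = list(range(1 << n_bits))
    rng.shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    encrypt = lambda x: table[x]
    decrypt = lambda y: inverse[y]
    return encrypt, decrypt

enc, dec = toy_ideal_cipher(key=0xDEADBEEF, n_bits=16)
assert dec(enc(12345)) == 12345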
January 17, 2026 at 11:26 PM
Unless... @qntm.org

"Very funny polling result where ~3/4 of people will say they read a book last year but if you ask them to name the book the share drops 20 points"

x.com/JosephPolita...
December 31, 2025 at 1:20 AM
New report on trends in AISI's evaluations of frontier AI models over the past two years. A lot of AI discourse focuses on viral moments, but it is important to zoom out to the less flashy trend: AI models are steadily growing in capability, including dual-use capability.

www.aisi.gov.uk/frontier-ai-...
December 18, 2025 at 10:06 AM
I think I’m all right. Thank you for homeschooling me, Mom!

www.nytimes.com/2025/12/14/o...
Opinion | Home-Schooled Kids Are Not All Right
December 14, 2025 at 6:46 PM
Lovely blog post version of a talk Scott Aaronson gave at the UK AISI Alignment Conference on theory and AI alignment. Thank you, Scott!

scottaaronson.blog?p=9333
Theory and AI Alignment
The following is based on a talk that I gave (remotely) at the UK AI Safety Institute Alignment Workshop on October 29, and which I then procrastinated for more than a month in writing up. Enjoy! T…
December 7, 2025 at 10:35 AM
A perk of being an American living in London who is from Alaska is that frequently when talking about temperatures I can refer to just "40 below" with no qualifiers.
November 28, 2025 at 10:32 AM
The UK AI Security Institute ran an Alignment Conference from 29-31 October in London! The goal was to gather a mix of people experienced in and new to alignment, and get into the details of novel approaches to alignment and related problems. Hopefully we helped create some new research bets! 🧵
November 13, 2025 at 5:00 PM
Reposted by Geoffrey Irving
🚨New paper🚨

From a technical perspective, safeguarding open-weight models is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.

🧵🧵🧵
November 12, 2025 at 2:04 PM
There is a real chance that my most important positive contribution to the world will have been to say something wrong on the internet.
November 10, 2025 at 10:24 AM
The UK AISI Cyber Autonomous Systems Team is hiring propensity researchers to grow the science around whether models *are likely* to attempt dangerous behaviour, as opposed to whether they are capable of doing so. 🧵

job-boards.eu.greenhouse.io/aisi/jobs/47...
Research Scientist - CAST Propensity
London, UK
November 7, 2025 at 9:14 AM
Spooky:

import Batteries.Data.UInt

def danger : UInt64 := UInt64.ofNat UInt64.size - 1
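-- Kernel view: UInt64.ofNat wraps mod 2^64, so UInt64.size becomes 0 and danger = 0 - 1 = 2^64 - 1.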
theorem danger_eq_large : danger = 18446744073709551615 := by decide +kernel
theorem danger_eq_one : danger = 1 := by native_decide
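-- `decide +kernel` and `native_decide` disagree: native_decide trusts compiled
-- evaluation, which here computes a different value, letting us prove False below.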
theorem bad : False := by simpa using danger_eq_large.symm.trans danger_eq_one
October 31, 2025 at 10:04 PM
Reposted by Geoffrey Irving
the time it would have taken me would probably have been of order of magnitude an hour (an estimate that comes with quite wide error bars). So it looks as though we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us. 3/3
October 31, 2025 at 7:25 PM
Reposted by Geoffrey Irving
I published a new post on my rarely updated personal blog! It's a sequel of sorts to my Quanta coverage of the Busy Beaver game, focusing on a particularly fearsome Turing machine known by the awesome name Antihydra.
Why Busy Beaver Hunters Fear the Antihydra
In which I explore the biggest barrier in the busy beaver game. What is Antihydra, what is the Collatz conjecture, how are they connected, and what makes them so daunting?
benbrubaker.com
October 27, 2025 at 4:04 PM
Another strong transition from @matt-levine.bsky.social.
October 23, 2025 at 7:59 PM
New AISI report mapping the cruxes that determine whether AI progress towards systems near or beyond human level at most cognitive tasks will be fast or slow. The goal is not to resolve these uncertainties but to reflect them: we don't know how AI will go, and we should plan accordingly!

www.aisi.gov.uk/research/und...
Understanding AI Trajectories: Mapping the Limitations of Current AI Systems
October 23, 2025 at 3:17 PM
New open source library from the UK AI Security Institute! ControlArena lowers the barrier to secure and reproducible AI control research, to boost work on blocking and detecting malicious actions in case AI models are misaligned. In use by researchers at GDM, Anthropic, Redwood, and MATS! 🧵
October 22, 2025 at 6:04 PM
There's a nice recent post by @tobyord.bsky.social on the efficiency of pretraining vs. RL, arguing that RL can learn at most 1 bit per episode given binary reward (the naive accounting is sketched below the link). He's right that RL is less efficient, but 1 bit is not actually a limit in practice. 🧵 on why:

www.tobyord.com/writing/inef...
The Extreme Inefficiency of RL for Frontier Models — Toby Ord
The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling...
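For concreteness, the naive accounting behind the 1-bit claim (my gloss of the setup, not the thread's rebuttal): if the only training signal over n episodes is the binary rewards R_1, ..., R_n, then any weights W learned from them satisfy

  I(W; R_{1:n}) ≤ H(R_{1:n}) ≤ ∑_{i=1}^n H(R_i) ≤ n bits.

The thread above takes this bound as the starting point and explains why the one-bit-per-episode figure is nonetheless not binding in practice.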
October 16, 2025 at 8:53 AM
Is there a Matt Levine but for pure mathematics?
October 1, 2025 at 5:30 PM
Ominous start to a Wikipedia page about a formula...

en.wikipedia.org/wiki/Fa%C3%A...
September 29, 2025 at 9:02 PM