Geoffrey Irving
@girving.bsky.social
Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
"Nobody suspects the all-1 string."
January 29, 2026 at 2:09 PM
Symmetric block ciphers like AES and the cores of modern hash functions are, roughly, keyed pseudorandom invertible functions. So a natural question is: if you pick a big enough nonlinear keyed invertible function at random, is it a secure block cipher? 🧵
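
As a toy sketch of the object in question (my own illustration, not from the post): the idealised version of a keyed pseudorandom invertible function is a family of independent, uniformly random permutations, one per key. Below is a minimal Python version on 8-bit blocks; it is obviously not a secure cipher at this size, it just shows what "keyed, pseudorandom, and invertible" means.

# Toy model only: a "keyed random permutation" on 8-bit blocks (names and sizes are my choices).
import random

def keyed_permutation(key: int, n_bits: int = 8):
    domain = list(range(1 << n_bits))
    rng = random.Random(key)       # the key selects which permutation we get
    rng.shuffle(domain)            # Fisher-Yates shuffle: a uniformly random permutation
    inverse = [0] * len(domain)
    for x, y in enumerate(domain):
        inverse[y] = x
    return (lambda x: domain[x]), (lambda y: inverse[y])

enc, dec = keyed_permutation(key=42)
assert all(dec(enc(x)) == x for x in range(256))   # invertible by construction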
January 17, 2026 at 11:26 PM
Unless... @qntm.org

"Very funny polling result where ~3/4 of people will say they read a book last year but if you ask them to name the book the share drops 20 points"

x.com/JosephPolita...
December 31, 2025 at 1:20 AM
New report on trends in AISI's evaluations of frontier AI models over the past two years. A lot of AI discourse focuses on viral moments, but it is important to zoom out to the less flashy trend: AI models are steadily growing in capabilities, including dual-use capabilities.

www.aisi.gov.uk/frontier-ai-...
December 18, 2025 at 10:06 AM
I think I’m all right. Thank you for homeschooling me, Mom!

www.nytimes.com/2025/12/14/o...
Opinion | Home-Schooled Kids Are Not All Right
www.nytimes.com
December 14, 2025 at 6:46 PM
Lovely blog post version of a talk Scott Aaronson gave at the UK AISI Alignment Conference on theory and AI alignment. Thank you, Scott!

scottaaronson.blog?p=9333
Theory and AI Alignment
The following is based on a talk that I gave (remotely) at the UK AI Safety Institute Alignment Workshop on October 29, and which I then procrastinated for more than a month in writing up. Enjoy! T…
scottaaronson.blog
December 7, 2025 at 10:35 AM
A perk of being an American living in London who is from Alaska is that frequently when talking about temperatures I can refer to just "40 below" with no qualifiers.
November 28, 2025 at 10:32 AM
Do you want to fund AI alignment research?

The AISI Alignment Team and I have reviewed >800 Alignment Project Applications from 42 countries, and we have ~100 that are very promising. Unfortunately, this means we have a £13-17M funding gap! Thread with details! 🧵
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
November 27, 2025 at 6:25 PM
The UK AI Security Institute ran an Alignment Conference from 29-31 October in London! The goal was to gather a mix of people experienced in and new to alignment, and get into the details of novel approaches to alignment and related problems. Hopefully we helped create some new research bets! 🧵
November 13, 2025 at 5:00 PM
Reposted by Geoffrey Irving
🚨New paper🚨

From a technical perspective, safeguarding open-weight model safety is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.

🧵🧵🧵
November 12, 2025 at 2:04 PM
There is a real chance that my most important positive contribution to the world will have been to say something wrong on the internet.
November 10, 2025 at 10:24 AM
The UK AISI Cyber Autonomous Systems Team is hiring propensity researchers to grow the science around whether models *are likely* to attempt dangerous behaviour, as opposed to whether they are capable of doing so. 🧵

job-boards.eu.greenhouse.io/aisi/jobs/47...
Research Scientist - CAST Propensity
London, UK
job-boards.eu.greenhouse.io
November 7, 2025 at 9:14 AM
Spooky:

import Batteries.Data.UInt

-- UInt64.size is 2^64, so `UInt64.ofNat UInt64.size` wraps to 0 and `0 - 1` wraps to 2^64 - 1.
def danger : UInt64 := UInt64.ofNat UInt64.size - 1
-- Kernel evaluation agrees that `danger` is 2^64 - 1...
theorem danger_eq_large : danger = 18446744073709551615 := by decide +kernel
-- ...but compiled (native) evaluation claims it is 1...
theorem danger_eq_one : danger = 1 := by native_decide
-- ...and the two contradictory proofs combine into a proof of False.
theorem bad : False := by simpa using danger_eq_large.symm.trans danger_eq_one
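
To spell out why this is spooky (my own addition, not part of the original snippet): once bad : False is provable, any proposition at all follows, for example:

-- With `bad : False` in scope, any statement can be "proved"; this line is my illustration.
theorem anything : 2 + 2 = 5 := bad.elim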
October 31, 2025 at 10:04 PM
Reposted by Geoffrey Irving
the time it would have taken me would probably have been of order of magnitude an hour (an estimate that comes with quite wide error bars). So it looks as though we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us. 3/3
October 31, 2025 at 7:25 PM
Reposted by Geoffrey Irving
I published a new post on my rarely updated personal blog! It's a sequel of sorts to my Quanta coverage of the Busy Beaver game, focusing on a particularly fearsome Turing machine known by the awesome name Antihydra.
Why Busy Beaver Hunters Fear the Antihydra
In which I explore the biggest barrier in the busy beaver game. What is Antihydra, what is the Collatz conjecture, how are they connected, and what makes them so daunting?
benbrubaker.com
October 27, 2025 at 4:04 PM
Another strong transition from @matt-levine.bsky.social.
October 23, 2025 at 7:59 PM
New AISI report mapping cruxes for whether AI progress might be fast or slow towards systems near or beyond human-level at most cognitive tasks. The goal is not to resolve uncertainties but to reflect them: we don't know how AI will go, and should plan accordingly!

www.aisi.gov.uk/research/und...
Understanding AI Trajectories: Mapping the Limitations of Current AI Systems
www.aisi.gov.uk
October 23, 2025 at 3:17 PM
New open source library from the UK AI Security Institute! ControlArena lowers the barrier to secure and reproducible AI control research, to boost work on blocking and detecting malicious actions in case AI models are misaligned. In use by researchers at GDM, Anthropic, Redwood, and MATS! 🧵
October 22, 2025 at 6:04 PM
There's a nice recent post by @tobyord.bsky.social on the efficiency of pretraining vs. RL, arguing that RL can learn at most 1 bit per episode given binary reward. It's right that RL is less efficient, but 1 bit is not actually a limit in practice. 🧵 on why:

www.tobyord.com/writing/inef...
The Extreme Inefficiency of RL for Frontier Models — Toby Ord
The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling...
www.tobyord.com
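
For scale, here is the back-of-the-envelope arithmetic behind a 1,000x to 1,000,000x gap, using my own assumed numbers (roughly one bit of usable signal per pretraining token, and RL episodes of 10^3 to 10^6 tokens with a single binary reward):

# Illustrative only; the per-token signal and episode lengths below are assumptions, not measurements.
pretrain_bits_per_token = 1.0                  # assumed order of magnitude for pretraining
for episode_tokens in (1_000, 1_000_000):      # assumed range of RL episode lengths
    rl_bits_per_token = 1.0 / episode_tokens   # at most 1 bit of binary reward, spread over the episode
    ratio = pretrain_bits_per_token / rl_bits_per_token
    print(f"{episode_tokens} tokens/episode -> ~{ratio:,.0f}x fewer bits per token for RL")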
October 16, 2025 at 8:53 AM
Is there Matt Levine but for pure mathematics?
October 1, 2025 at 5:30 PM
Ominous start to a Wikipedia page about a formula...

en.wikipedia.org/wiki/Fa%C3%A...
September 29, 2025 at 9:02 PM
Reposted by Geoffrey Irving
Amongst the projects funded is my project www.renaissancephilanthropy.org/a-dataset-of... to create what in 2025 is a super-hard dataset of pairs (informal hard proof, formal statement) of recent results from top journals. The challenge for the machine is to formalise the rest of the paper.
www.renaissancephilanthropy.org
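
To give a sense of the (informal, formal statement) format with a deliberately easy toy example, nothing like the hard journal results the dataset targets: the informal statement "there are infinitely many primes" would be paired with a formal statement such as the following (my illustration, using a lemma that already exists in Mathlib):

import Mathlib

-- Formal statement half of a toy pair; the dataset's statements would come from recent papers.
theorem toy_pair_example : ∀ n : ℕ, ∃ p, n ≤ p ∧ p.Prime :=
  Nat.exists_infinite_primes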
September 18, 2025 at 8:25 AM
Cas is very good and you should hire him as faculty!
📌📌📌
I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com
Stephen Casper
Visit the post for more.
stephencasper.com
September 4, 2025 at 12:38 PM
From near the end of Sleepwalkers, by Christopher Clark, as World War I starts.
August 23, 2025 at 3:40 PM
Reposted by Geoffrey Irving
I'm honored to serve as Expert Advisor for "The Alignment Project", an international initiative dedicated to ensuring AI systems are safe and beneficial. They are providing significant funding, compute, and collaboration opportunities for researchers, including those in cogsci/neuro. Please apply!
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
August 20, 2025 at 5:54 PM