Anton Leicht
@antonleicht.bsky.social
Politics & Frontier AI Policy | PhD Candidate AI Agent Alignment

antonleicht.me
So what to do? From the political POV, I think:
• More separation between eval technical work and policy advocacy
• Focus on propensity over capability
• Clearer articulation of implications for deployment
• Greater focus on frameworks, fewer evals for evals' sake

(11/N)
December 17, 2024 at 12:00 PM
And if current evals did have implications for deployment (I don't think they do), this should be made clear, and they should not be used for developer PR and frog-boiling.

As long as concerning eval results just lead to audiences getting used to inaction, more evals are not as politically helpful. (10/N)
December 17, 2024 at 12:00 PM
As one exhibit, see Apollo's CEO Marius Hobbhahn defending results against both ducks and rabbits simultaneously: (8/N)
December 17, 2024 at 12:00 PM
3️⃣ The 'rabbit-duck-problem'. Exactly the same eval results simultaneously allow for catastrophization and dismissal.

Is this evidence for impending doom, or evidence that someone asked a robot to pretend to be evil? Turns out, both. (7/N)
December 17, 2024 at 12:00 PM
@sebk.bsky.social puts the source of the resulting confusion very succinctly: (6/N)
December 17, 2024 at 12:00 PM
2️⃣ Incentive structures favor dramatic capability findings over propensity assessments. Eval orgs, media, and developers are all incentivized to amplify 'scary' capability results. It makes for good, but confusing, headlines. (5/N)
December 17, 2024 at 12:00 PM
1️⃣ Entanglements. Many eval orgs are funded by the same sources that fund policy orgs pushing for evals. This is easily exploitable by opponents.

This was already starting to show in the SB-1047 discussion: (4/N)
December 17, 2024 at 12:00 PM
New Post: The AI Eval Political Economy is in trouble.

Evals are crucial for safety-focused AI policy - but four structural problems threaten their future effectiveness as policy instruments. 🧵
December 17, 2024 at 12:00 PM
This strategy is well-precedented. In the past, threatened delays have often been rolled back fairly quickly, as in numerous conflicts between the EU and Meta or Apple.

It's a little disheartening to see usually cooler heads elsewhere take the bait. (10/N)
December 10, 2024 at 6:42 PM
Delaying Sora is a good lobbying strategy:

First, it focuses latent frustration on uncomfortable policy, e.g. on AIA implementation through the Code of Practice and national laws.

Through delays, OpenAI is leveraging public & business pressure to weaken this implementation. (8/N)
December 10, 2024 at 6:42 PM
'Sora's delay is bad because it means some models won't be available in Europe at all'

I just don't think there is good evidence for that takeaway. If major models actually skip Europe, I'll be very concerned.

But currently, statements like the one below have better alternative explanations. (7/N)
December 10, 2024 at 6:42 PM
'Sora's delay is bad because it means more future delays'

Compounding costs from further delays are very concerning. I think they're a little less likely than reported: OpenAI has delayed Sora, but not o1, a similarly regulation-relevant model (see the scorecard). (5.5/N)
December 10, 2024 at 6:42 PM
But assuming the regulatory landscape will stay as it is, how could the delay of Sora mean that 'Europe is so fucked' - why is it bad for the EU economy?

Three options: the Sora delay itself is costly, future delays will be costly, or future skipped releases will be costly. (4/N)
December 10, 2024 at 6:42 PM
Delays follow from an asymmetry: UK and EU are regulated, many other markets aren't.

But that could change: Many other markets (e.g. US states - map below from June 2024) might also regulate soon. Once they reach critical mass, local delays wouldn't be realistic anymore. (3/N)
December 10, 2024 at 6:42 PM
Why the delay?

No official info, but it's unlikely that this is (only) about the EU AI Act: Sora is also delayed in the UK, which has no AIA. More likely, it's about the roughly similar EU Digital Markets Act and UK Online Safety Act.

The Guardian has a similar interpretation: (1/N)
December 10, 2024 at 6:42 PM
A ray of hope: Institutions like the AI Safety Institutes are genuinely effective and can be politically valuable down the road.

But as they get associated with unpopular policy, they're at risk, too - like with Sen. Cruz's attacks on NIST. (5/N)
December 8, 2024 at 8:58 AM
Looking at the output, there are some genuine wins, some high-profile failures and some precarious achievements. (3/N)
December 8, 2024 at 8:58 AM