Lightnews — Scholar-powered news

Cas (Stephen Casper)

@scasper.bsky.social

200 followers 190 following 270 posts

AI technical gov & risk management research. PhD student @MIT_CSAIL, fmr. UK AISI. I'm on the CS faculty job market! https://stephencasper.com/

Posts Replies Media Videos

Pinned

Cas (Stephen Casper) @scasper.bsky.social · Sep 4

📌📌📌
I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com

Stephen Casper

Visit the post for more.

stephencasper.com

Cas (Stephen Casper)

@scasper.bsky.social

On the @csis.org podcast with Greg Allen and Stephen Clare, we talk about the International AI Safety Report, technical safeguards, and what engineers can (and can't!) do to save us from AI risks.

www.youtube.com/watch?v=2VlX...

Inside The Second Int'l AI Safety Report with Stephen Clare & Stephen Casper | The AI Policy Podcast

YouTube video by Center for Strategic & International Studies

www.youtube.com

February 10, 2026 at 3:18 PM

Cas (Stephen Casper)

@scasper.bsky.social

Personally, I wish Anthropic would go a step further and also mention at the end of their ads that they aren't currently embroiled in multiple lawsuits over the deaths of children.

🧵🧵🧵 What I like about the new Claude Ads

www.theverge.com/ai-artificia...

Anthropic says ‘Claude will remain ad-free,’ unlike ChatGPT

‘Ads are coming to AI. But not to Claude.’

www.theverge.com

February 5, 2026 at 5:22 AM

Cas (Stephen Casper)

@scasper.bsky.social

The 2026 International AI Safety Report has 221 pages of cutting-edge research from a team of 36 writers and over 100 contributors referencing 1451 citations...

...And also this cheeky riddle that I wrote on page 23 (original content).

internationalaisafetyreport.org/publication/...

February 3, 2026 at 2:48 PM

Cas (Stephen Casper)

@scasper.bsky.social

The IAISR is one of a kind. Every paragraph has undergone many rounds of scrutiny from dozens of experts and stakeholders over the course of months.

I'm thankful for the rest of the writing team. If you're interested, my work this year was mostly in sections 1.1 and 3.3.

Yoshua Bengio @yoshuabengio.bsky.social · 11d

Today we’re releasing the International AI Safety Report 2026: the most comprehensive evidence-based assessment of AI capabilities, emerging risks, and safety measures to date. 🧵

(1/19)

February 3, 2026 at 2:46 PM

Cas (Stephen Casper)

@scasper.bsky.social

In a few years, I think it would be cool to do a project and write a paper on how prominent AI systems from US companies respond to political questions before vs. after the next presidential transition.

February 1, 2026 at 3:00 PM

Cas (Stephen Casper)

@scasper.bsky.social

Turns out, there are a TON of image/video AI models hosted on CivitAI with dogwhistles for NCII and/or CSAM in their names. 👀

Max Kamachee and I just updated our "Video Deepfake Abuse" paper with this new fig:

🔗 papers.ssrn.com/sol3/papers....

January 30, 2026 at 9:43 PM

Reposted by Cas (Stephen Casper)

FAR.AI

@far.ai

Open-weight model safety is AI safety in hard mode. Anyone can modify every parameter. @scasper.bsky.social: Open-weight models are only months behind closed models, which are reaching dangerous capability thresholds. 2026 will be critical.👇

January 29, 2026 at 4:32 PM

Cas (Stephen Casper)

@scasper.bsky.social

This is not a new report (it's from last summer). But it's now finally available on SSRN, more accessibly than before. Great working with Claire Short on this.

papers.ssrn.com/sol3/papers....

January 27, 2026 at 12:21 PM

Cas (Stephen Casper)

@scasper.bsky.social

I made a fully-open, living document with notes and concrete project ideas about tamper-resistance and open-weight model safety research.

You, yes you 🫵, should feel free to look, comment, or message me about it.

docs.google.com/document/d/1...

https://docs.google.com/document/d/10XkZpUabt4fEK8BUtd8Jz26-M8ARQ6c5iJCbefaUtQI/edit?usp=sharing

t.co

January 23, 2026 at 6:28 PM

Cas (Stephen Casper)

@scasper.bsky.social

Here are some miscellaneous title ideas for papers that I'm not currently working on, but sometimes daydream about. Let me know if you are thinking about anything related.

January 22, 2026 at 4:55 PM

Cas (Stephen Casper)

@scasper.bsky.social

Research on tamper-resistant machine unlearning is funny.

The SOTA, according to papers proposing techniques, is resistance to tens of thousands of adversarial fine-tuning steps.

But according to papers that do second-party red-teaming, the SOTA is just a couple hundred steps.

January 22, 2026 at 2:00 PM

Cas (Stephen Casper)

@scasper.bsky.social

To people working on adversarial vulnerabilities for safeguards against AI deepfake porn, I'm glad you're doing what you're doing. But don't forget that mitigations matter, & we're not always up against sophisticated attacks. Half the time, the perpetrators are literal teenagers.

January 13, 2026 at 2:02 PM

Cas (Stephen Casper)

@scasper.bsky.social

🧵 Non-consensual AI deepfakes are out of control. But the 1st Amendment will likely prevent the US from directly prohibiting models/apps that make producing personalized NCII trivial.

In this thread, I'll explain the problem and a 1st Amendment-compatible solution (I think).

January 12, 2026 at 7:30 PM

Cas (Stephen Casper)

@scasper.bsky.social

One example of how easily harmful derivatives of open-weight models proliferate can be found on Hugging Face. Search "uncensored" or "abliterated" in the model search bar. You'll find some 7k models fine-tuned specifically to remove safeguards.

January 10, 2026 at 2:00 PM

Cas (Stephen Casper)

@scasper.bsky.social

🚨 New paper from an awesome group led by Noam Kolt and
@nickacaputo
.

We hear a lot about what important concepts and methods from AI research that lawyers need to understand. But it's really a two-way street...

🧵🧵🧵

January 8, 2026 at 10:40 PM

Reposted by Cas (Stephen Casper)

JHU Computer Science

@jhucompsci.bsky.social

Join us for our first CS seminar of the year, featuring @scasper.bsky.social! Learn more about his upcoming talk here: www.cs.jhu.edu/event/cs-sem... and check out other upcoming seminars here: www.cs.jhu.edu/department-s...

Computer Science Seminar Series: Making Robust AI Safeguards Run Deep. January 15, 2026, 228 Malone Hall. Refreshments available 12:15 p.m. Seminar begins 12:30 p.m. Stephen Casper, Massachusetts Institute of Technology.

January 8, 2026 at 2:24 PM

Cas (Stephen Casper)

@scasper.bsky.social

🧵Thanks in part to recent attention on Grok's widespread undressing, emerging consciousness around AI nudification apps is sparking discussions on making AI undressing apps illegal. Minnesota and the UK are currently actively considering laws that would do this.

January 7, 2026 at 12:00 AM

Cas (Stephen Casper)

@scasper.bsky.social

Given the current Grok deepfake snafu on Twitter this week, I'll leave this here. We put it online a month ago.
t.co/3qWCNzoZrh

January 5, 2026 at 6:13 PM

Cas (Stephen Casper)

@scasper.bsky.social

I think these are my 4 favorite papers of 2025.

December 30, 2025 at 10:57 PM

Cas (Stephen Casper)

@scasper.bsky.social

With, e.g., OpenAI planning over 1T in commitments in the next few years, it increasingly seems that one of two bad things will inevitably happen: a bubble bursting or the concentration of obscene levels of power in tech. I don't see how this ends well.

techcrunch.com/2025/11/06/...

Sam Altman says OpenAI has $20B ARR and about $1.4 trillion in data center commitments | TechCrunch

Altman named a long list of upcoming business he thinks will generate significant revenue.

techcrunch.com

December 19, 2025 at 3:56 PM

Cas (Stephen Casper)

@scasper.bsky.social

Taking AI safety seriously means taking open-weight model safety seriously. Unfortunately, the AI safety field has historically mostly worked with closed models in mind. Here, I explain how we can meet new challenges from open models.

www.youtube.com/watch?v=VWk3...

Stephen Casper - Powerful Open-Weight AI Models: Wonderful, Terrible & Inevitable [Alignment Worksho

YouTube video by FAR․AI

www.youtube.com

December 18, 2025 at 5:04 PM

Cas (Stephen Casper)

@scasper.bsky.social

🧵🧵🧵 In the past few months, I have looked at hundreds, maybe thousands, of AI porn images/videos (for science).

Here's what I learned from our investigation of over 50 platforms, sites, apps, Discords, etc., while writing this paper.

papers.ssrn.com/sol3/papers...

December 15, 2025 at 2:59 PM

Cas (Stephen Casper)

@scasper.bsky.social

🧵 I think people often assume that AI images/video will get harder to distinguish from natural ones over time with better models.

In most (non-adversarial) cases, I expect the opposite will often apply...

December 12, 2025 at 5:00 PM

Cas (Stephen Casper)

@scasper.bsky.social

Excited that our paper has been on SSRN for 8 days, but became SSRN's most downloaded paper of the past 60 days in two ejournal categories. Glad about this -- I think this is one of the more important projects I've worked on.

papers.ssrn.com/sol3/papers....

December 11, 2025 at 7:05 PM

Cas (Stephen Casper)

@scasper.bsky.social

UK AISI is hiring for a technical research role on open-weight model safeguards.

www.aisi.gov.uk/careers

December 11, 2025 at 2:00 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news