Cas (Stephen Casper)
@scasper.bsky.social
AI technical gov & risk management research. PhD student @MIT_CSAIL, fmr. UK AISI. I'm on the CS faculty job market! https://stephencasper.com/
Here's the paper:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5705186
November 12, 2025 at 2:17 PM
@agstrait
@_robertkirk
@DanHendrycks
@PeterHndrsn
@zicokolter
@geoffreyirving
@yaringal
@Yoshua_Bengio
@dhadfieldmenell
@ai_risks
@AISecurityInst
November 12, 2025 at 2:04 PM
It was great working on this paper with some of the smartest people I know.

Kyle O'Brien (@kyletokens.bsky.social)
@ShayneRedford
@ea_seger
@kevin_klyman
@RishiBommasani
@ani_nlp
@iliaishacked
@sorenmind
@xksteven
@SPOClab
@KellinPelrine
@evijitghosh
...
November 12, 2025 at 2:04 PM
Speaking of which, @ai_risks is giving away $350k in research grants for open-weight model safety. The RFP is open until December 7.

docs.google.com/forms/d/e/1...
Open-Weight Safeguards Proposal
November 12, 2025 at 2:04 PM
Just as building the science of open-weight model risk management will provide a collective good, it will also require collective effort.
November 12, 2025 at 2:04 PM
We also find that, currently, prominent open-weight model developers often do not implement mitigations, or do not report on them. So there is a lot of room for more innovation and information as the science grows.
November 12, 2025 at 2:04 PM
In response, we cover 16 open technical problems with *unique* implications for open-weight model safety. They span the model lifecycle across training data curation, training algorithms, evaluations, deployment, and ecosystem monitoring.
November 12, 2025 at 2:04 PM
Taking AI safety seriously increasingly means taking open-weight models seriously.
November 12, 2025 at 2:04 PM
Empirical harms enabled by open models are also mounting. For example, the Internet Watch Foundation has found that they are the tools of choice for generating non-consensual AI deepfakes depicting children.

admin.iwf.org.uk/media/nadlc...
November 12, 2025 at 2:04 PM
Most importantly, powerful open-weight models are probably inevitable: in recent years, they have steadily grown in prominence, capability, and influence. Here are two nice graphics I often point to.

Thx @EpochAIResearch & Bhandari et al.
November 12, 2025 at 2:04 PM
Compared to proprietary models, open-weight models present different opportunities and problems. I often say that they are simultaneously wonderful and terrible. For example, they allow for more open research and testing, but they can also be arbitrarily tampered with, as the sketch below illustrates.
November 12, 2025 at 2:04 PM
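To make "arbitrarily tampered with" concrete, here is a minimal illustrative sketch (not from the paper; the checkpoint name "open-org/open-model-7b" and the output path are hypothetical). It shows that anyone holding the weights can load them with standard tooling, modify any parameter, and save a new checkpoint, with no API, license check, or guardrail in the loop:

```python
# Minimal sketch of weight tampering on an open-weight model.
# Assumptions: torch + transformers installed; checkpoint name is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "open-org/open-model-7b"  # hypothetical open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Once the weights are local, nothing enforces the original safeguards:
# every parameter is editable. Random noise stands in here for targeted
# tampering (e.g., fine-tuning away refusal behavior).
with torch.no_grad():
    for param in model.parameters():
        param.add_(0.01 * torch.randn_like(param))

# The modified copy can be saved and redistributed like any other checkpoint.
model.save_pretrained("tampered-model")
tokenizer.save_pretrained("tampered-model")
```

The same few lines work whether the edit is random noise or a targeted fine-tune, which is why open-weight mitigations have to be robust to downstream modification rather than assuming the shipped model stays fixed.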
...although Mark Twain might have some thoughts on those damned usage statistics.
November 8, 2025 at 7:42 PM