Dr Waku
drwaku.bsky.social
Dr Waku
@drwaku.bsky.social
YouTuber, AI research scientist, computer science PhD. I talk about how AI will affect all of us and society as a whole.
AI agents have a half-life for their success rates at completing tasks. Yes, the same type of half-life as in nuclear chemistry: a constant chance of task failure (= radioactive decay) in each time period... 1/10
June 27, 2025 at 1:01 AM
The US AI Safety Institute (AISI) is being renamed into the Center for AI Standards and Innovation (CAISI). It has a pretty similar sounding mandate. But I would like to point out that Canada had this name first with the Canadian AI Safety Institute (CAISI)! :O 1/2
June 5, 2025 at 2:22 PM
AI is not like other technologies. AI doesn't just make everything we were doing before faster and cheaper. It’s another, faster feedback loop of reasoning and intelligence beyond the human brain. It will change life as we know it. Here’s why. (1/10)
May 28, 2025 at 1:56 AM
This September, the book "If anyone builds it, everyone dies" comes out. It's about the existential risk from superintelligence, and it's by Eliezer Yudkowsky @esyudkowsky.bsky.social, one of the most well-known figures in AI safety. 1/3
May 16, 2025 at 2:10 PM
OpenAI released o4 and o3-mini today, and some safety testing was performed by the third party METR, who said:

"We detected several successful and unsuccessful attempts at “reward hacking” by o3.
1/7
April 17, 2025 at 12:53 AM
Today is the two year anniversary of my YouTube channel. 28.6K subscribers and 1.25M views. I think it's fair to say it's changed my life. Thank you to everyone who watches and participates in my community :)
April 5, 2025 at 11:07 PM
Today, OpenAI released PaperBench, an AI benchmark aimed at replicating cutting edge AI research papers from scratch. In this benchmark, the AI has to understand the paper, write code, and execute experiments. In other words, AI must become an (AI) research scientist.
April 2, 2025 at 10:18 PM
If anyone is interested in creating videos/posts/etc about AI safety, you can apply for funding from FLI:

futureoflife.org/pro...

Please contact me if you'd like to make long or short video content, I would be happy to collaborate or give tips!
Digital Media Accelerator - Future of Life Institute
The Digital Media Accelerator supports digital content from creators raising awareness and understanding about ongoing AI developments and issues.
futureoflife.org
March 28, 2025 at 8:00 PM
A few timelines about AI:
- algorithmic gains for models doubles every 8 months
- amount of compute used grows 10x every two years
- the length of tasks agents can perform is doubling every 7 months
- 50.8% of global VC funds are going to AI-focused companies
March 24, 2025 at 9:44 PM
Anthropic is using Amazon Trainium 2 chips to train their next models. It takes about three of these chips to match an H100 (at fp16) and four to match B200 (fp16). And Anthropic will have 400,000 of them at their disposal, from Amazon's Project Rainier.
March 22, 2025 at 10:38 AM
I heard an institution say, about a relatively important meeting, don't bring any devices with DeepSeek installed on them. Obviously worried about data exfiltration to China. There is such a technological separation growing alongside the ideological one.
March 21, 2025 at 2:38 AM
I believe that it is likely impossible to fully defend an LLM against jailbreaks. All instructions within the prompt exist at the same level, which means the precedence confusion between system/developer and user cannot be fully resolved. There is no way of enforcing rules.
March 19, 2025 at 2:12 AM
When Anthropic made a new alignment technique for Claude 3.7, called Constitutional Classifiers, they said 3,000 hours of jailbreaking effort had not been able to break the system. Then they created a public contest, and someone built a universal jailbreak within three weeks.
March 18, 2025 at 2:13 AM
AI has always had the problem of moving goalposts. When AI algorithms like A* and Djikstra's were invented, they were quickly considered to be "just search". When AI systems beat the best humans at chess, Jeopardy, and Go, it made news but quickly became the new normal.
March 16, 2025 at 6:20 AM
The term "AI safety" has several confusing meanings. I like to break it down into three categories, based on where malicious goals are coming from:

- Robustness: no malicious goals
- Misuse risk: human provides malicious goals
- Existential risk: AI provides malicious goals
March 15, 2025 at 1:46 AM
There's an AI security forum in Paris this February, one of the satellite events to the 3rd international AI summit. The event "will bring together ~150 experts in AI safety, policy, and cybersecurity to discuss securing powerful AI systems". Let me know if you're interested!
December 24, 2024 at 10:14 PM
Reposted by Dr Waku
Recently answered @anilananth.bsky.social's questions for Nature. No matter when it arrives, AGI and the road to reach it will both help tackle thorny problems (e.g. climate change and diseases), and pose huge risks. Understanding and transparency are key.
www.nature.com/articles/d41...
How close is AI to human-level intelligence?
Large language models such as OpenAI’s o1 have electrified the debate over achieving artificial general intelligence, or AGI. But they are unlikely to reach this milestone on their own.
www.nature.com
December 6, 2024 at 2:11 PM
Reposted by Dr Waku
Fascinating: In 2-hour sprints, AI agents outperform human experts at ML engineering tasks like optimizing GPU kernel. But humans pull ahead over longer periods - scoring 2x better at 32 hours. AI is faster but struggles with creative, long-term problem solving (for now?). metr.org/blog/2024-11...
November 23, 2024 at 8:20 PM
@robertwiblin.bsky.social Hi, I would love to be added to your EA starter pack. Also, would love to chat sometime about the 80k podcast!
November 25, 2024 at 1:20 AM
@ahappier.world hello from another YouTuber, I focus on how AI will impact society and AI safety. Love your thumbnails
November 25, 2024 at 1:18 AM
In-depth analysis of why it makes sense to concentrate on restricting compute, instead of say talent, for AI safety
arxiv.org/pdf/2402.08797
arxiv.org
November 25, 2024 at 1:09 AM
Reposted by Dr Waku
Made a lesswrong starterpack. reply if you use lesswrong and want to be put in it, or if there's anyone else I've not added that should be added!

go.bsky.app/EkcDcjA
November 19, 2024 at 7:59 PM
Bluesky needs a fail whale business.time.com/2013/11/06/h...
November 20, 2024 at 9:16 PM
I created a Manifund to raise money for my channel for 2025:
manifund.org/projects/240...
24,000 subscriber YouTube channel on AI safety | Dr Waku
Cover anticipated costs for making videos in 2025
manifund.org
November 20, 2024 at 8:57 PM
This is a great resource for what's happening in China around AI safety (thanks Circle Circle):

aisafetychina.substack.com
AI Safety in China | Concordia AI | Substack
Delivered to your inbox every two weeks. Click to read AI Safety in China, by Concordia AI, a Substack publication.
aisafetychina.substack.com
November 17, 2024 at 9:01 PM