Karim Abdel Sadek
karimabdel.bsky.social
Incoming PhD, UC Berkeley

Interested in RL, AI Safety, Cooperative AI, TCS

https://karim-abdel.github.io
*New Paper*

🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function instead of the human's intended goal.

😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
July 8, 2025 at 5:16 PM
Reposted by Karim Abdel Sadek
CAIF's new and massive report on multi-agent AI risks will be a really useful resource for the field
www.cooperativeai.com/post/new-rep...
February 21, 2025 at 2:24 PM
Reposted by Karim Abdel Sadek
A large group of us (spearheaded by Denizalp Goktas) have put out a position paper on paths towards foundation models for strategic decision-making. Language models still lack these capabilities so we'll need to build them: hal.science/hal-04925309...
February 18, 2025 at 6:33 PM
Reposted by Karim Abdel Sadek
Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker).
We performed the largest-ever comparison of these algorithms.
We find that they do not outperform generic policy gradient methods, such as PPO.
arxiv.org/abs/2502.08938
1/N
February 14, 2025 at 6:41 PM
Reposted by Karim Abdel Sadek
The 2025 Cooperative AI summer school (9-13 July 2025 near London) is now accepting applications, due March 7th!
www.cooperativeai.com/summer-schoo...
January 9, 2025 at 7:25 PM
Reposted by Karim Abdel Sadek
The magic thing humans do is a pretty good job of solving tasks under high uncertainty about the problem specification. We are also frequently capable of doing this collaboratively. I still do not see evidence that models can do any part of this.
December 21, 2024 at 1:08 AM
I will be at @neuripsconf.bsky.social this week!

Would love to chat about Multi-agent systems, RL, Human-AI Alignment, or anything interesting :)

I'm also applying for PhD programs this cycle, feel free to reach out for any advice!

More about me: karim-abdel.github.io
December 8, 2024 at 11:59 PM
Reposted by Karim Abdel Sadek
I give you a loaded coin, with some (unknown) probability 0<p<1 of landing Heads, and I ask you to generate a fair coin toss.

Great! We know how to do this! This is the Von Neumann trick: toss twice. If HH or TT, repeat; if HT or TH, return the first.

Problem solved? Not quite... This can be bad!
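The Von Neumann trick described above can be sketched in a few lines of Python (the function names are illustrative, not from the original post):

```python
import random

def biased_coin(p):
    """Toss a loaded coin that lands Heads with probability p."""
    return "H" if random.random() < p else "T"

def von_neumann_fair_toss(p):
    """Von Neumann's trick: toss the loaded coin twice.
    HT and TH are equally likely (each p*(1-p)), so returning the
    first toss when the two differ yields a fair coin.
    On HH or TT, discard the pair and repeat."""
    while True:
        first, second = biased_coin(p), biased_coin(p)
        if first != second:
            return first
```

Why it "can be bad": each pair of tosses succeeds with probability 2p(1-p), so the expected number of tosses is 1/(p(1-p)), which blows up as p approaches 0 or 1 — a coin with p = 0.99 needs about 100 tosses on average per fair bit.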
November 18, 2024 at 8:50 PM