sierra-wyllie.bsky.social
📢 New ICML 2025 paper!

Confidential Guardian: Cryptographically Prohibiting the Abuse of Model Abstention

🤔 Think model uncertainty can be trusted?
We show that it can be misused—and how to stop it!
Meet Mirage (our attack💥) & Confidential Guardian (our defense🛡️).

🧵1/10
June 2, 2025 at 2:38 PM