🧵 Papers in order of presentation below:
Confidential Guardian: Cryptographically Prohibiting the Abuse of Model Abstention
🤔 Think model uncertainty can be trusted?
We show that it can be misused—and how to stop it!
Meet Mirage (our attack💥) & Confidential Guardian (our defense🛡️).
🧵1/10