Ryan Webster
@ryanwebby.bsky.social
Privacy in AI @inria rennes
TL;DR: We show a real-world use case for MI: it can be used to audit privacy claims. Perhaps it's time to advertise privacy (via MI benchmarks) alongside performance benchmarks for CLIP models (see DP-CAP, Sander et al., for some great work in this direction). (11/10)
October 22, 2025 at 7:34 PM
What went wrong? We aren't sure, but perhaps the blurring wasn't strong enough (the bottom left shows the codebase's default), or the face detection was faulty. Either way, privacy wasn't protected as intended. (10/10)
October 22, 2025 at 7:34 PM
Let's look at a real-world use case. The DataComp suite of CLIP models claimed to put privacy first by blurring faces before training. However, we found that even after a strict facial crop, DataComp models could still identify people. This was not the case when heavy blurring was applied. (9/10)
October 22, 2025 at 7:34 PM
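For readers who want to poke at a DataComp checkpoint themselves, a minimal probe might look like the sketch below. This is an illustration, not the paper's pipeline: the model/pretrained tag, file path, and candidate names are assumptions (check open_clip.list_pretrained() for what is actually published). It simply asks a CLIP model which name best matches a tightly cropped face.

```python
# Hypothetical probe of a DataComp-trained CLIP model (tag and inputs are
# assumptions; see open_clip.list_pretrained() for available checkpoints).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-L-14", pretrained="datacomp_xl_s13b_b90k"
)
tokenizer = open_clip.get_tokenizer("ViT-L-14")
model.eval()

candidate_names = ["Alice Example", "Bob Example", "Carol Example"]  # placeholder names
prompts = [f"a photo of {name}" for name in candidate_names]

image = preprocess(Image.open("face_crop.jpg")).unsqueeze(0)  # hypothetical face crop
with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(tokenizer(prompts))
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(candidate_names[probs.argmax().item()], probs.max().item())
```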
We found that around 1/3 to 1/2 of the ~10k individuals in VGGFace2 can be identified under the (harder) setting (2), i.e., Alice learns the name. (8/10)
October 22, 2025 at 7:34 PM
Distractors are important: the more distractors we submit, the less likely the model is to pick the right name by chance (i.e., we control the probability of a false positive). The more names, the more confident we can be; we submit up to 1M diverse distractor names taken from Wikipedia. (7/10)
October 22, 2025 at 7:34 PM
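To see why distractors control false positives, here is a back-of-the-envelope calculation (my own illustration with made-up numbers, not the paper's exact test): under the null hypothesis that the model cannot identify Bob, each of his face images hits his true name among n candidate names with probability about 1/n, so repeated hits are binomially unlikely.

```python
# Illustrative false-positive control: probability of repeated "lucky" hits
# under random guessing among n candidate names.
from scipy.stats import binom

def false_positive_prob(n_names: int, k_images: int, c_hits: int) -> float:
    """P(>= c_hits correct guesses out of k_images) under random guessing."""
    return binom.sf(c_hits - 1, k_images, 1.0 / n_names)

# Adding distractors drives the false-positive probability down:
print(false_positive_prob(n_names=1_000, k_images=10, c_hits=2))      # ~4.5e-5
print(false_positive_prob(n_names=1_000_000, k_images=10, c_hits=2))  # ~4.5e-11
```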
For name generation (3), we showed names can be directly extracted from CLIP using a _language-only_ model (Mistral) and a bit of reinforcement-learning magic. (6/10)
October 22, 2025 at 7:34 PM
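We won't reproduce the RL loop here, but the key ingredient is easy to picture: CLIP itself can score how strongly a candidate name matches Bob's face images, and a reward signal could be built from that score. A rough sketch under those assumptions (the function names, prompt template, and setup are illustrative, not the paper's exact recipe):

```python
# Rough sketch of a reward signal for a name-generating language model:
# how strongly CLIP associates a candidate name with Bob's face images.
# (Assumptions throughout; not the paper's exact setup.)
import torch

@torch.no_grad()
def name_reward(candidate_name, face_features, clip_model, tokenizer):
    """Mean cosine similarity between 'a photo of <name>' and pre-computed,
    L2-normalized CLIP features of Bob's face images (shape [k, d])."""
    tokens = tokenizer([f"a photo of {candidate_name}"])
    text_feat = clip_model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    return (face_features @ text_feat.T).mean().item()

# A language model (e.g. a Mistral checkpoint) proposes candidate names, and
# reinforcement learning pushes it toward names that earn a higher reward.
```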
She counts how many times CLIP maps Bob's facial images to (1) the ground-truth name or (2) the same name repeatedly. This is her "attack function," used to make a membership decision. (5/10)
October 22, 2025 at 7:34 PM
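Concretely, the counting step is simple enough to write down. Below is an illustrative version (the threshold and example predictions are placeholders, not the calibrated values from the paper); it takes the top-1 name CLIP predicts for each of Bob's face images and turns the counts into a membership decision.

```python
# Toy version of the attack function (threshold is a placeholder).
from collections import Counter

def membership_decision(predicted_names, ground_truth=None, min_hits=3):
    """`predicted_names` holds CLIP's top-1 name for each of Bob's face images.
    Count either hits on the known ground-truth name (setting 1) or repeats of
    the most frequent name (settings 2/3), and flag membership past a threshold."""
    if ground_truth is not None:
        hits = sum(name == ground_truth for name in predicted_names)
        best_name = ground_truth
    else:
        best_name, hits = Counter(predicted_names).most_common(1)[0]
    return hits >= min_hits, best_name, hits

is_member, name, hits = membership_decision(
    ["Bob Example", "Bob Example", "Dana Example", "Bob Example"]
)
print(is_member, name, hits)  # True 'Bob Example' 3
```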
We consider 3 setups where Alice submits images of Bob to identify him by name.
(1) Alice has Bob's name (see Hintersdorf et al.); (2) Alice has a phonebook containing Bob's name; (3) Alice generates a set of plausible names, then performs (2). (4/10)
October 22, 2025 at 7:34 PM
However, in the specific instance "Was this person's face/name used to train a VLM?", we can craft a robust attack. Knowledge of the training data isn't necessary; instead, we assume a model cannot a priori identify an individual by name better than random guessing (a "ranking counterfactual" argument). (3/10)
October 22, 2025 at 7:34 PM
Membership Inference (MI) is the process of determining whether a data sample was used to train a model. Unfortunately, trustworthy MI against foundation models is extremely hard: in standard MI, we cannot know whether we've made a false accusation unless we have some knowledge of the training set. (2/10)
October 22, 2025 at 7:34 PM