Nice to read about those risks from another pov!
There is a lot of exciting work on robust audits; here are a few papers I enjoyed:
arxiv.org/abs/2402.02675
arxiv.org/abs/2504.00874
arxiv.org/abs/2502.03773
arxiv.org/abs/2305.13883
arxiv.org/abs/2410.02777
We instantiate our framework with a simple idea: just look at the accuracy of the platform's answers.
Our experiments show that this can help reduce the amount of unfairness a platform could hide.
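A minimal sketch of that check, assuming the auditor holds ground-truth labels for its query set; the function names and the tolerance `tau` are illustrative, not our exact procedure.

```python
from typing import Sequence

def audit_accuracy(answers: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of audit queries the platform answered correctly."""
    return sum(a == y for a, y in zip(answers, labels)) / len(labels)

def passes_accuracy_check(answers: Sequence[int], labels: Sequence[int],
                          expected_acc: float, tau: float = 0.05) -> bool:
    """Flag the platform if observed accuracy drops more than `tau`
    below what the auditor expects from an unmanipulated model."""
    return audit_accuracy(answers, labels) >= expected_acc - tau
```

The intuition: flipping answers to look fair costs accuracy, so a suspicious drop limits how much unfairness the platform can hide.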
🔒Crypto guarantees: the model provider is forced to commit to their model and sign every answer (toy sketch below).
📐Clever ML tricks: the auditor uses information about the model (training data, model structure, ...) to decide what counts as a "good answer".
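A toy illustration of the crypto side, with big caveats: a real deployment would use a binding commitment scheme and public-key signatures, while this sketch stands them in with a weight hash and a shared-key HMAC; all names are illustrative.

```python
import hashlib
import hmac
import pickle

def commit_model(weights) -> str:
    """Commitment the provider publishes before the audit starts (toy: a hash of the weights)."""
    return hashlib.sha256(pickle.dumps(weights)).hexdigest()

def tag_answer(key: bytes, commitment: str, query: str, answer: str) -> str:
    """Tag binding this answer to the query and to the committed model (toy: HMAC instead of a signature)."""
    msg = f"{commitment}|{query}|{answer}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_answer(key: bytes, commitment: str, query: str, answer: str, tag: str) -> bool:
    """Auditor-side check that the answer was produced under the published commitment."""
    return hmac.compare_digest(tag, tag_answer(key, commitment, query, answer))
```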
Thus, nothing prevents you from manipulating the answers of your model to pass the audit.
And this is very easy! In fact, any fairness mitigation method can be transformed into an audit manipulation attack.
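A sketch of why, assuming a score-based classifier: take any post-processing mitigation (here, group-specific thresholds) and apply it only when the auditor is asking. The `from_auditor` flag and the thresholds are illustrative simplifications.

```python
def raw_decision(score: float) -> int:
    """What real users get: the platform's unmitigated model."""
    return int(score >= 0.5)

def mitigated_decision(score: float, group: str, thresholds: dict) -> int:
    """A classic post-processing mitigation: one threshold per group,
    tuned so positive rates match across groups."""
    return int(score >= thresholds[group])

def platform_answer(score: float, group: str, from_auditor: bool,
                    thresholds: dict) -> int:
    """The manipulation: use the 'fair' rule only when the auditor asks."""
    if from_auditor:
        return mitigated_decision(score, group, thresholds)
    return raw_decision(score)
```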
An audit is pretty straightforward.
1/ I, the auditor 🕵️, come up with questions to ask your model.
2/ You, the platform 😈, answer my questions.
3/ I look at your answers, compute a series of aggregate metrics, and decide whether your system abides by the law (bare-bones loop below).
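A bare-bones version of that loop, taking demographic parity as an example aggregate metric; `ask_platform` and the tolerance `epsilon` are placeholders.

```python
from collections import defaultdict

def demographic_parity_gap(answers, groups) -> float:
    """Largest difference in positive-answer rate between groups."""
    totals, positives = defaultdict(int), defaultdict(int)
    for a, g in zip(answers, groups):
        totals[g] += 1
        positives[g] += a
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)

def run_audit(queries, groups, ask_platform, epsilon: float = 0.1) -> bool:
    """Steps 1-3: ask the queries, collect answers, check the aggregate metric."""
    answers = [ask_platform(q) for q in queries]                # steps 1 and 2
    return demographic_parity_gap(answers, groups) <= epsilon  # step 3
```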