sorelle
@friedler.net
CS prof at Haverford, Chair @acm.org U.S. tech policy, Brookings nonres Senior Fellow, former White House OSTP tech policy, co-author AI Bill of Rights, research on AI and society, @facct.bsky.social co-founder
formerly @kdphd 🐦
sorelle.friedler.net
Trump's AI Action Plan, released today, aims to use federal procurement policy to shape the speech AI systems can generate, requiring that it be free from "ideological bias."

On at least one political issue, the AI platforms are already in agreement.
July 24, 2025 at 1:38 AM
And - no surprise - this holds across identity groups and across APIs. AI filters have the same problem, incorrectly blocking speech generation.

In work w/ @metaxa.net and students we find that identity-related text is 2-3x more likely to be *incorrectly* filtered than other text.

arxiv.org/abs/2409.13725
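
A minimal sketch of the kind of check behind this finding, using OpenAI's moderation endpoint as the example API (the paper audits several vendors) and a couple of invented benign sentences, not the paper's actual dataset:

```python
# Hypothetical mini-audit: compare how often a moderation filter flags
# benign text with vs. without identity terms. Every sentence here is
# benign, so any flag is a false positive.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

neutral = [
    "I am a person and I love my family.",
    "As a parent, I often speak at school board meetings.",
]
identity = [
    "I am a lesbian and I love my family.",
    "As a Muslim parent, I often speak at school board meetings.",
]

def flag_rate(texts):
    """Fraction of texts the filter flags."""
    flags = [
        client.moderations.create(
            model="omni-moderation-latest", input=t
        ).results[0].flagged
        for t in texts
    ]
    return sum(flags) / len(flags)

print("false-flag rate, neutral text: ", flag_rate(neutral))
print("false-flag rate, identity text:", flag_rate(identity))
```

The 2-3x figure in the paper comes from a far larger dataset run across multiple vendors' APIs, not a toy comparison like this.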
May 3, 2025 at 7:30 PM
You can calculate responsiveness scores too! We've released our code, including a quickstart guide.

We'd love to hear if or how you find them useful.

github.com/ustunb/reachml

...4/
April 24, 2025 at 4:37 PM
Instead, we score each feature by the proportion of changes to that feature alone that would lead to recourse.

We call these responsiveness scores and find that they can successfully identify features that individuals can change to get a better outcome. ...3/
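
For intuition, a from-scratch sketch (this is not the reachml API; the toy loan data, feature names, and action sets are all invented for illustration): for one individual, enumerate feasible changes to a single feature and count the fraction that flip the model to the favorable outcome.

```python
# Responsiveness, sketched: the fraction of feasible single-feature
# changes that flip the model's decision for one individual.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy loan data: columns are [age, income_k, n_open_accounts].
X = rng.normal(loc=[40, 50, 3], scale=[10, 15, 2], size=(500, 3))
y = (0.04 * X[:, 1] + 0.3 * X[:, 2] > 3).astype(int)  # approval label
model = LogisticRegression().fit(X, y)

# Feasible single-feature actions. Age is immutable: no actions,
# so its responsiveness is 0 by definition.
actions = {
    0: [],                # age: cannot be changed
    1: [5, 10, 20, 40],   # income_k: raise income
    2: [-2, -1, 1, 2],    # n_open_accounts: open/close accounts
}

def responsiveness(x, feature):
    """Fraction of actions on `feature` that yield the favorable outcome."""
    acts = actions[feature]
    if not acts:
        return 0.0
    flips = sum(
        model.predict((x + delta * np.eye(3)[feature]).reshape(1, -1))[0] == 1
        for delta in acts
    )
    return flips / len(acts)

x_denied = X[model.predict(X) == 0][0]  # someone the model denies
for j, name in enumerate(["age", "income_k", "n_open_accounts"]):
    print(f"{name}: responsiveness = {responsiveness(x_denied, j):.2f}")
```

The released reachml code (repo linked in the post above) is the real implementation, with a quickstart guide.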
April 24, 2025 at 4:37 PM
In work with @harrycheon.bsky.social @anniewernerfelt.bsky.social @berkustun.bsky.social we show that many features highlighted by SHAP and LIME are non-responsive: they can't be changed (like age) or wouldn't lead to a better model outcome (e.g., getting a loan) even if you did change them!... 2/
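
To make that failure mode concrete, a toy sketch (invented data and weights, not the paper's experiments): when an immutable feature drives the model, SHAP dutifully ranks it first, so the top "explanation" is something no applicant can act on.

```python
# Toy demonstration: SHAP ranks an immutable feature (age) as the top
# reason for a decision, yet changing age is impossible, so the
# explanation offers no recourse. All numbers are invented.
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
names = ["age", "income_k", "n_open_accounts"]
X = rng.normal(loc=[40, 50, 3], scale=[10, 15, 2], size=(500, 3))
y = (0.08 * X[:, 0] + 0.02 * X[:, 1] > 4.2).astype(int)  # age dominates
model = LogisticRegression().fit(X, y)

explainer = shap.LinearExplainer(model, X)
shap_vals = explainer.shap_values(X[:1])[0]  # one applicant's attributions

ranking = np.argsort(-np.abs(shap_vals))
print("SHAP importance ranking:", [names[j] for j in ranking])
# Typically prints age first: a non-responsive feature tops the explanation.
```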
April 24, 2025 at 4:37 PM
Hey AI folks - stop using SHAP! It won't help you debug [1], won't catch discrimination [2], and makes no sense for feature importance [3].

Plus - as we show - it also won't give recourse.

In a paper at #ICLR we introduce feature responsiveness scores... 1/

arxiv.org/pdf/2410.22598
April 24, 2025 at 4:37 PM
Our in-progress work shows that, across AI systems, identity-related speech (whether about marginalized or dominant groups) is more likely than other speech to be incorrectly flagged.

arxiv.org/abs/2409.13725
November 21, 2024 at 2:06 PM
In a recent audit (with @metaxa.net and students) we found that even some PG-rated TV scripts get blocked by OpenAI's automated content moderation filter.

Press release description: ai.seas.upenn.edu/news/censori...

Actual paper: facctconference.org/static/paper...
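
The per-text check at the core of the audit looks roughly like this, assuming OpenAI's current Python client and moderation endpoint (the model name is today's; the audit predates the omni models). The excerpt is an invented stand-in, not a line from the audited scripts:

```python
# Check whether a single script-like excerpt trips the moderation filter,
# and inspect per-category scores to see why. The excerpt is invented.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

excerpt = "He slams the door. 'This whole town is going to pay for this!'"
result = client.moderations.create(
    model="omni-moderation-latest", input=excerpt
).results[0]

print("flagged:", result.flagged)
# The top category scores hint at which rule fired (e.g., violence).
scores = result.category_scores.model_dump()
for category, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(f"  {category}: {score:.3f}")
```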
November 21, 2024 at 2:06 PM