Stephen Pfohl
@stephenpfohl.bsky.social
Research scientist at Google. Previously Stanford Biomedical Informatics. Researching #fairness #equity #robustness #transparency #causality #healthcare
It looks like they updated the article with a correction!
November 3, 2025 at 10:32 PM
This is not true (and I'm surprised by the bad reporting here from 404). arXiv is no longer accepting *review papers* unless they are peer reviewed. This has no effect on the submission of research articles. See the original post: blog.arxiv.org/2025/10/31/a....
Attention Authors: Updated Practice for Review Articles and Position Papers in arXiv CS Category – arXiv blog
November 3, 2025 at 6:32 PM
This work was a collaboration with Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Salaudeen, @adoubleva.bsky.social, Shannon Sequeira, Santiago Arciniegas, Lillian Sung, Nnamdi Ezeanochie, Heather Cole-Lewis, @kat-heller.bsky.social, Sanmi Koyejo, Alexander D'Amour.
October 28, 2025 at 12:36 AM
2. downstream context (the fairness or equity implications that a model has when used as a component of a policy/intervention in a specific context).
October 28, 2025 at 12:36 AM
1. upstream context (e.g., understanding the role of social and structural determinants of disparities and their impact on selection, measurement, and problem formulation)
October 28, 2025 at 12:36 AM
We advocate for an approach that uses interdisciplinary expertise and domain knowledge to ground the analytic approach to model evaluation in both:
October 28, 2025 at 12:36 AM
Beyond characterization of modeling implications, we argue that fairness (as well as related concepts such as equity or justice) is best understood not as a property of a model, but rather as a property of a policy or intervention that leverages the model in a specific sociotechnical context.
October 28, 2025 at 12:36 AM
3. We provide evaluation methodology, based on controlling for confounding and on conditional independence testing, that complements standard disaggregated evaluation and gives insight into why model performance differs across subgroups.
October 28, 2025 at 12:36 AM
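A minimal sketch of what point 3 can look like in practice (entirely synthetic data and hypothetical variable names, not the exact methodology from our paper): compare raw disaggregated error rates with error rates stratified on a confounder, then check whether subgroup membership still predicts the model's errors once the confounder is conditioned on.

# Sketch: disaggregated evaluation with stratification on a confounder,
# plus a simple regression-based conditional independence check.
# Synthetic data and variable names are illustrative, not from the paper.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, n)                    # subgroup indicator A
severity = rng.normal(group * 0.5, 1.0, n)       # confounder X: distribution differs by group
y = (severity + rng.normal(0, 1.0, n)) > 0.5     # outcome Y depends on X, not on A directly
score = severity + rng.normal(0, 0.7, n)         # model score uses X only
pred = score > 0.5

df = pd.DataFrame({"group": group, "severity": severity,
                   "y": y.astype(int), "error": (pred != y).astype(int)})

# 1) Standard disaggregated evaluation: error rate per subgroup.
print(df.groupby("group")["error"].mean())

# 2) Stratified (confounder-adjusted) evaluation: error rate per subgroup
#    within coarse severity strata. Differences often shrink once X is held fixed.
df["severity_bin"] = pd.qcut(df["severity"], 4, labels=False)
print(df.groupby(["severity_bin", "group"])["error"].mean().unstack())

# 3) Simple parametric conditional independence check: regress the error
#    indicator on the confounder and the subgroup indicator; if errors are
#    independent of subgroup given the confounder, the "group" coefficient
#    should be near zero. (In practice, adjust for the confounder flexibly,
#    e.g., with splines or bins; the linear term here is only for brevity.)
X = sm.add_constant(df[["severity", "group"]].astype(float))
fit = sm.Logit(df["error"], X).fit(disp=False)
print(fit.summary().tables[1])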
2. Observing model performance differences thus motivates deeper investigation to understand the causes of distributional differences across subgroups and to disambiguate them from observational biases (e.g., selection bias) and from model estimation error.
October 28, 2025 at 12:36 AM
A few concrete practical takeaways:

1. Our results show that if we want to model outcomes well when those outcomes are disparate across subgroups, we should not, in general, expect parity in model performance across subgroups.
October 28, 2025 at 12:36 AM
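As a toy illustration of takeaway 1 (my own synthetic example, not a result from the paper): even the Bayes-optimal predictor E[Y | X] has unequal mean squared error across subgroups whenever the groups differ in how much of the outcome is unexplainable from X.

# Toy illustration (synthetic, not from the paper): the Bayes-optimal
# regressor E[Y | X] still has unequal MSE across subgroups when the
# residual (unexplainable) variance of the outcome differs by subgroup.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, n)
x = rng.normal(0, 1, n)
noise_sd = np.where(group == 0, 0.5, 1.5)        # group 1 has noisier outcomes
y = 2.0 * x + rng.normal(0, noise_sd)

y_opt = 2.0 * x                                  # the Bayes-optimal prediction E[Y | X]
for g in (0, 1):
    mse = np.mean((y[group == g] - y_opt[group == g]) ** 2)
    print(f"group {g}: MSE of the optimal predictor = {mse:.2f}")
# Roughly 0.25 vs 2.25: parity fails even with a perfect model.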
3. How do model performance and fairness properties change under different assumptions about the data generating process (reflecting different causal processes and structural causes of disparity) and about mechanisms of selection bias (which render the data misrepresentative of the ideal target population)?
October 28, 2025 at 12:36 AM
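To make the selection-bias part of question 3 concrete, here is a small synthetic sketch (my own setup, not the paper's): evaluate one fixed model on the full target population and then on a sample in which one subgroup is mostly observed when the outcome is positive; the disaggregated accuracy for that subgroup shifts even though the model has not changed.

# Synthetic sketch (not the paper's setup): outcome-dependent selection in one
# subgroup makes the evaluation data misrepresentative of the target population,
# shifting the disaggregated metric for that subgroup.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
group = rng.integers(0, 2, n)
x = rng.normal(0, 1, n)
y = (x + rng.normal(0, 1, n) > 0).astype(int)
# One fixed decision rule, identical across groups; the threshold sits away
# from the Bayes-optimal one so that errors concentrate on y = 1 cases.
pred = (x > 0.8).astype(int)

def accuracy_by_group(mask):
    return [np.mean((pred == y)[mask & (group == g)]) for g in (0, 1)]

everyone = np.ones(n, dtype=bool)
print("target population accuracy:", accuracy_by_group(everyone))

# Selection mechanism: group 1 is observed mostly when y = 1 (e.g., only the
# sickest patients in that group reach the clinic where labels are recorded).
keep = (group == 0) | (y == 1) | (rng.random(n) < 0.2)
print("selected-sample accuracy:  ", accuracy_by_group(keep))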
2. When and why do models that explicitly use subgroup membership information for prediction behave differently from those that do not?
October 28, 2025 at 12:36 AM
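A small sketch of the kind of comparison behind question 2 (synthetic data and my own illustrative setup, using scikit-learn rather than anything from the paper): fit one model with the subgroup indicator as a feature and one without, on data where the same feature value maps to different outcome probabilities in the two groups, then compare disaggregated log-loss.

# Synthetic sketch (my own setup, not the paper's): compare a model that uses
# the subgroup indicator as a feature against one that does not, on data where
# the feature-to-outcome relationship differs by subgroup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
n = 50_000
group = rng.integers(0, 2, n)
x = rng.normal(0, 1, n)
# The same feature value implies a different outcome probability per subgroup.
logit = x + np.where(group == 1, 1.0, -1.0)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_with = np.column_stack([x, group])
X_without = x.reshape(-1, 1)

m_with = LogisticRegression().fit(X_with, y)
m_without = LogisticRegression().fit(X_without, y)

for g in (0, 1):
    idx = group == g
    ll_with = log_loss(y[idx], m_with.predict_proba(X_with[idx])[:, 1])
    ll_without = log_loss(y[idx], m_without.predict_proba(X_without[idx])[:, 1])
    print(f"group {g}: log-loss with group feature = {ll_with:.3f}, without = {ll_without:.3f}")
# In this setup the group-blind model is systematically miscalibrated within
# each subgroup; whether that matters depends on how the scores are used downstream.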
A few of the key questions that we grappled with in this work included:

1. Why do models that predict outcomes well (even optimally) for all subgroups still exhibit systematic differences in performance across subgroups?
October 28, 2025 at 12:36 AM
To summarize, we took a deep dive into some of the more challenging conceptual issues that arise when evaluating machine learning models across subgroups, as is typically done to assess fairness or robustness.
October 28, 2025 at 12:36 AM
This is (unfortunately) the required style for Nature journals
December 20, 2024 at 3:11 PM