Lightnews — Scholar-powered news

Luke Guerdan

@lukeguerdan.bsky.social

PhD student @ Carnegie Mellon University
I design tools and processes to support principled evaluation of AI systems.
lukeguerdan.com


Luke Guerdan

@lukeguerdan.bsky.social

📄 arxiv.org/abs/2507.02819

This work was in collaboration with the amazing team @devsaxena.bsky.social (co-first author), @schancellor.bsky.social, @zstevenwu.bsky.social, and @kenholstein.bsky.social

Thank you for making my first adventure into qualitative research a delightful experience :)

Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks

Data scientists often formulate predictive modeling tasks involving fuzzy, hard-to-define concepts, such as the "authenticity" of student writing or the "healthcare need" of a patient. Yet the process...

arxiv.org

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

Our paper offers design implications to support this, such as:

- Protocols to help data scientists identify minimum standards for validity and other criteria, tailored to their specific application context
- Tools designed to help data scientists identify and apply strategies more effectively

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

The challenge for HCI, CSCW, and ML is not to *replace* these bricolage practices with rigid top-down planning, but to develop scaffolding that enhances the rigor of bricolage while preserving creativity and adaptability

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

Yet from urban planning to software engineering, history is rife with examples where rigid top-down interventions have failed while bottom-up alternatives designed to better scaffold *existing* practices succeeded

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

What do these findings mean for how we improve target variable construction going forward? We might be tempted to more stringently enforce a rigid "top-down planning approach" to measurement, in which data scientists more carefully define construct → design operationalization → collect data

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

How do data scientists evaluate validity? They treat their target variable definition as a tangible object to be scrutinized. They "poke holes" in their definition then "patch" them. They apply a variety of "spot checks" to reconcile their theoretical understanding of a concept with observed labels

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

Data scientists navigate this balancing act by adaptively applying (re)formulation strategies

For example, they use "swapping" to change target variables when the first one runs into unanticipated challenges, or "composing" to capture complementary dimensions of a concept within a target variable

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

While engaging in bricolage, data scientists balance the validity of their target variable with other criteria, such as:
💡 Simplicity
⚙️ Resource requirements
🎯 Predictive performance
🌎 Portability

Image description: An illustration of the target variable construction process presented in our findings. During target variable construction, data scientists specify an initial prediction task based on their available data, then iteratively refine their prediction task by applying (re)formulation strategies. Data scientists proceed with their final prediction task if it satisfies all criteria, or discontinue their project if strategies are exhausted.

October 14, 2025 at 2:54 PM
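
The construction process described in the illustration above can be read as a simple loop. The sketch below is only an illustration of that reading, not the paper's method: `PredictionTask`, `construct_target`, the example criterion, and the example "swapping" strategy are all hypothetical names, and trying each strategy once, in order, is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class PredictionTask:
    """Hypothetical stand-in for a prediction task, identified by its target variable."""
    target: str


# A criterion checks one requirement (validity, simplicity, ...) on a task.
Criterion = Callable[[PredictionTask], bool]
# A (re)formulation strategy maps the current task to a revised task.
Strategy = Callable[[PredictionTask], PredictionTask]


def construct_target(
    initial: PredictionTask,
    criteria: Sequence[Criterion],
    strategies: Sequence[Strategy],
) -> Optional[PredictionTask]:
    """Refine `initial` until all criteria hold, or return None if strategies run out.

    Assumption: each strategy is applied at most once, in the order given.
    """
    task = initial
    remaining = list(strategies)
    while True:
        if all(check(task) for check in criteria):
            return task  # proceed with this prediction task
        if not remaining:
            return None  # strategies exhausted: discontinue
        task = remaining.pop(0)(task)


# Hypothetical example: the first target fails a validity spot check,
# so a "swapping" strategy replaces it with a different target variable.
def passes_validity_check(task: PredictionTask) -> bool:
    return task.target != "raw_visit_count"


def swap_target(task: PredictionTask) -> PredictionTask:
    return PredictionTask(target="diagnosed_conditions")


final_task = construct_target(
    PredictionTask(target="raw_visit_count"),
    criteria=[passes_validity_check],
    strategies=[swap_target],
)
print(final_task)
```

In practice the criteria would cover validity alongside simplicity, resource requirements, predictive performance, and portability, and the choice of the next strategy would be a judgment call instead of a fixed order.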

Luke Guerdan

@lukeguerdan.bsky.social

We find that target variable construction is a *bricolage practice*, in which data scientists creatively "make do" with the limited resources at hand

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

To explore this tension, we interviewed 15 data scientists from education and healthcare sectors to understand their practices, challenges, and perceived opportunities for target variable construction in predictive modeling

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

Traditional measurement theory assumes a top-down workflow, where data is collected to fit a study's goals (define construct → design operationalization → collect data)

In contrast, data scientists are often forced to reconcile their measurement goals with *existing* data

October 14, 2025 at 2:54 PM

Luke Guerdan

@lukeguerdan.bsky.social

You are eligible to participate if you have experience designing evaluations that use both (1) an LLM-as-a-judge and (2) a rubric to rate GenAI outputs. We welcome participants from all professional roles. Participants must be 18+ and be located in the U.S.

August 19, 2025 at 7:46 PM
