Christopher Boyer
@cboyer.bsky.social
Assistant Professor, Case Western Reserve University (CCLCM). Staff Biostatistician, Cleveland Clinic.

Epidemiologist interested in causal inference, infectious disease, trial design

https://christopherbboyer.com/about.html

#causalsky #statssky #episky
Aren’t you just calculating the residualized pseudo-outcomes (so this all fits nicely in the semiparametric noncentered influence function literature)?
October 25, 2025 at 1:30 PM
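A minimal numeric sketch of the kind of residualized pseudo-outcome alluded to above (the AIPW / DR-learner construction). The data-generating process and all names are illustrative, not taken from any cited paper; true nuisances are plugged in so the identity is easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative only):
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-X))           # true propensity score
A = rng.binomial(1, e)
mu0 = X                            # E[Y | A=0, X]
mu1 = X + 2.0                      # E[Y | A=1, X]; true ATE = 2
Y = np.where(A == 1, mu1, mu0) + rng.normal(size=n)

# AIPW / DR-learner pseudo-outcome: residualize Y against the outcome
# model, reweight by the propensity, and add back the plug-in contrast.
# Its conditional mean is the CATE, so regressing it on X (or simply
# averaging) recovers the causal effect.
mu_A = np.where(A == 1, mu1, mu0)
phi = (A - e) / (e * (1 - e)) * (Y - mu_A) + (mu1 - mu0)

print(phi.mean())                  # close to 2.0, the true ATE
```

In practice the nuisances e and mu would be cross-fit estimates, which is where the semiparametric influence-function machinery earns its keep.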
I have a similar take, but for stats/data handlers who don’t have, and don’t want to gain, familiarity with the process by which the data are made. E.g. engaging with the EMR signified-and-signifier doom loop.
October 10, 2025 at 7:03 PM
Yes, this is a great example!
October 9, 2025 at 4:28 PM
Reposted by Christopher Boyer
I have an interesting case study on this actually. So at the beginning of the semester, I was preparing a bit on the (mis)use of LLMs for the course I co-teach

One of things I did was have it summarize one of my own papers, since people say "it's so good at it"
arxiv.org/abs/2503.02789
Accounting for Missing Data in Public Health Research Using a Synthesis of Statistical and Mathematical Models
Introduction: Missing data is a challenge to medical research. Accounting for missing data by imputing or weighting conditional on covariates relies on the variable with missingness being observed at ...
arxiv.org
October 9, 2025 at 3:05 PM
And as a species we’re so badly wired for finding needles in the haystacks of systems we didn’t design and barely understand.
October 9, 2025 at 2:13 PM
More complex stuff is doubly dangerous because some of the models (ahem, Claude Code) will generate so much code for you but then make the dumbest mistakes (or nefarious stuff like functions with the right name that do nothing, or generating fake data), and now you have a needle-in-a-haystack problem.
October 9, 2025 at 2:07 PM
✅ Derive model performance statistics (IP weighting, standardization, doubly-robust) that allow the candidate prediction model to be misspecified.
✅ Theoretical results: identifiability conditions and efficiency.
✅ Simulation & applied examples showing where naïve models fail.
October 8, 2025 at 2:36 PM
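One way to picture the IP-weighted performance statistic mentioned above: suppose we want the MSE a risk model would attain if no one were treated (a counterfactual performance estimand). A sketch under an assumed data-generating process with known propensity, not the paper's estimator; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Illustrative DGP: treatment probability rises with X, and the
# prediction error of the candidate model also varies with X, so the
# untreated subset is unrepresentative of the target population.
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-0.8 * X))             # propensity of treatment
A = rng.binomial(1, e)
eps = rng.normal(size=n)
Y0 = 2 * X + np.exp(0.3 * X) * eps         # outcome under no treatment
Y = np.where(A == 1, Y0 - 1.0, Y0)         # treatment lowers the outcome
pred = 2 * X                               # candidate prediction model

# Naive MSE on the untreated subset is distorted by who gets treated;
# IP weighting untreated subjects by 1/(1 - e) recovers the MSE the
# model would have against Y0 in the whole population.
u = A == 0
sq_err = (Y[u] - pred[u]) ** 2
mse_naive = sq_err.mean()
mse_ipw = np.average(sq_err, weights=1 / (1 - e[u]))
# True counterfactual MSE here is E[exp(0.6 X)] = exp(0.18), about 1.20;
# the naive estimate lands well below it.
```

The doubly-robust version would add an outcome-model term, protecting against misspecifying either nuisance; this sketch shows only the weighting piece.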
Our contributions:
✅ Formal definition of “counterfactual prediction estimands.”
✅ Derive estimators combining causal inference (IP weighting, standardization) with predictive modeling that allow for separation between covariates for confounding control and covariates for prediction.
October 8, 2025 at 2:36 PM
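A toy version of that covariate-role separation: the confounder enters only through the weights, while the deployed prediction model uses only the prediction covariate. A sketch under an assumed data-generating process with known propensity, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Illustrative DGP: X is the prediction covariate, L a confounder of
# treatment A; the target is E[Y^{a=0} | X], a counterfactual
# prediction estimand, which here equals 1.5 * X.
X = rng.normal(size=n)
L = 0.5 * X + rng.normal(size=n)
e = 1 / (1 + np.exp(-L))                   # treatment depends on L only
A = rng.binomial(1, e)
Y0 = X + L + rng.normal(size=n)            # E[Y^{a=0} | X] = 1.5 * X
Y = np.where(A == 1, Y0 - 2.0, Y0)         # treatment shifts the outcome

# Naive: regress Y on X among the untreated -- confounded by L.
D = np.column_stack([np.ones(n), X])
u = A == 0
naive = np.linalg.lstsq(D[u], Y[u], rcond=None)[0]

# IP-weighted: the same regression of Y on X only, weighting untreated
# subjects by 1/(1 - e(L)).  L is used for confounding control (weights)
# but never enters the prediction model itself.
sw = np.sqrt(1 / (1 - e[u]))
ipw = np.linalg.lstsq(D[u] * sw[:, None], Y[u] * sw, rcond=None)[0]
# ipw[1] recovers the target slope of 1.5; naive[1] falls short.
```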
Examples:

- Differences in post-baseline treatment policies between training and target population.
- Clinical decision support tools that are meant to inform treatment adoption.
- Removal of undesirable events or features in training data that are unreflective of the target population.
October 8, 2025 at 2:36 PM
Why does this matter?

Many common tasks in clinical prediction modeling target outcomes or performance statistics under hypothetical interventions (either explicitly or implicitly).
October 8, 2025 at 2:36 PM
Yes just got it!
August 24, 2025 at 5:55 PM
Thank you!
August 24, 2025 at 5:17 PM
Also the frontier seems to be a concurrent prospective trial plus observational study (although obviously this has been around forever, e.g. the WHI) pmc.ncbi.nlm.nih.gov/articles/PMC...
Prospective benchmarking of an observational analysis in the SWEDEHEART registry against the REDUCE-AMI randomized trial
Prospective benchmarking of an observational analysis against a randomized trial increases confidence in the benchmarking process as it relies exclusively on aligning the protocol of the trial and the...
pmc.ncbi.nlm.nih.gov
August 12, 2025 at 11:01 AM
Not to say that both use cases don’t still have their issues of course!
August 12, 2025 at 10:58 AM
Increasingly think that some of the best uses of TTE are complementary, e.g. 1) postmarket in a space where you have a premarket RCT that has helped set bounds on reasonable effects and maybe identified negative controls, or 2) in an emerging area to help argue for potential equipoise for an RCT
August 12, 2025 at 10:51 AM
Depends on whether you’re interpreting it as the SATE or the PATE. If the former, then your sample has no power users, and repeated randomizations using the same procedure over the fixed sample produce no bias. Now if this error raises suspicion of other randomization errors, then maybe not?
June 18, 2025 at 1:18 AM
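The SATE point above can be checked in a few lines: hold a sample's potential outcomes fixed and repeat the randomization procedure; the difference-in-means is unbiased for the SATE regardless of which units happen to be in the sample. Numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Fixed sample of n units with fixed potential outcomes (SATE framing).
n = 50
y0 = rng.normal(size=n)
y1 = y0 + rng.normal(1.0, 0.5, size=n)
sate = (y1 - y0).mean()

# Repeat the same randomization procedure over the SAME sample.
est = []
for _ in range(20_000):
    a = np.zeros(n, dtype=int)
    a[rng.choice(n, size=n // 2, replace=False)] = 1  # complete randomization
    y = np.where(a == 1, y1, y0)
    est.append(y[a == 1].mean() - y[a == 0].mean())

bias = np.mean(est) - sate   # vanishes up to Monte Carlo error
```

Under the PATE interpretation the missing power users are a sampling problem, not a randomization problem, which is why the two estimands give different answers here.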