Lightnews — Scholar-powered news

shira mitchell

@shiraamitchell.bsky.social

blog post: continued struggles with equivalent weights

November 4, 2025 at 9:06 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: Blue Rose Research is hiring !

We are looking for a teammate with expertise in both LLM tools and statistical modeling.

Someone who clearly communicates assumptions, results, and uncertainty. With care and kindness.

October 28, 2025 at 8:32 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: individualism doesn't work

typical machine learning loss looks at one individual at a time

but for MRP, we care about aggregates

October 22, 2025 at 1:30 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: MRPW

you've got a survey collected by someone else, and they gave you weights.

how can you use those weights in the MRP (Multilevel Regression and Poststratification) ?

October 15, 2025 at 1:46 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: struggles with equivalent weights

you've done MRP.

someone asks you for survey weights.

how to get them ?

October 7, 2025 at 11:56 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: beyond balancing

in midterms, voters tend to support the out party for balance

do polls still help predict midterms ? yes

October 1, 2025 at 10:40 AM

shira mitchell

@shiraamitchell.bsky.social

blog post: Fat Bear Week

Basu's Bears is a lesson in:

1) using auxiliary information (pre-salmon-feasting weights)

2) how bad an unbiased estimator can be

statmodeling.stat.columbia.edu/2025/09/23/s...

September 23, 2025 at 8:19 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: random sampling is not leaving

we turned to response instrument Z because random sampling is "dead"

but does this method still rely on starting with random sampling ?

September 16, 2025 at 9:01 PM

shira mitchell

@shiraamitchell.bsky.social

blog post on imputation (again):

we want E[Y|X] but X can be missing

@lucystats.bsky.social @sarahlotspeich.bsky.social @glenmartin.bsky.social @maartenvsmeden.bsky.social et al. say:

random imputation should use Y
deterministic imputation shouldn't

statmodeling.stat.columbia.edu/2025/09/09/s...

September 9, 2025 at 8:22 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: connections between survey statistics and experimental design.

split-plot designs are analogous to cluster sampling.

blocking is analogous to stratification.

featuring an experiment by Arjun Potter and colleagues at NM-AIST !

September 3, 2025 at 3:25 AM

shira mitchell

@shiraamitchell.bsky.social

blog post: Thomas Lumley writes about Interviewing your Laptop

what are the problems with using LLMs as survey respondents ?

how are these similar to problems with poststratification ?

CC @tslumley.bsky.social

August 27, 2025 at 7:37 AM

shira mitchell

@shiraamitchell.bsky.social

blog post: answers from the BLS

2 weeks ago we learned about the CES employer survey that produces the jobs count.

we asked: why use employment size in stratification but not nonresponse adjustment ?

BLS responded !

statmodeling.stat.columbia.edu/2025/08/19/s...

August 19, 2025 at 8:53 PM

shira mitchell

@shiraamitchell.bsky.social

the BLS is so helpful in their communication !

August 13, 2025 at 8:53 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: 2nd helpings of the 2nd flavor of calibration 🍨🍨

in political surveys, we "logit shift" predictions to match known aggregates (e.g. total Democratic votes).

but what happens for multinomial outcomes ?

a fun excuse to review IPF/raking 🍂

statmodeling.stat.columbia.edu/2025/08/12/s...
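a minimal sketch of the binary logit shift mentioned above, not the code from the blog post: add one constant to every prediction on the logit scale, chosen so the weighted average matches a known aggregate. the function name `logit_shift`, the bisection search, and the toy numbers are all assumptions made for illustration.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logit_shift(probs, weights, target, tol=1e-10):
    """Find a constant delta so that the weighted mean of
    expit(logit(p) + delta) equals `target`.

    The weighted mean is increasing in delta, so bisection works.
    Returns (delta, shifted_probs).
    """
    total_weight = sum(weights)

    def aggregate(delta):
        return sum(
            w * expit(logit(p) + delta) for p, w in zip(probs, weights)
        ) / total_weight

    lo, hi = -20.0, 20.0  # assumed wide enough to bracket the solution
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if aggregate(mid) < target:
            lo = mid
        else:
            hi = mid
    delta = (lo + hi) / 2.0
    return delta, [expit(logit(p) + delta) for p in probs]


# Hypothetical example: three cells with predicted Democratic share,
# cell sizes as weights, and a known overall share of 0.55.
probs = [0.2, 0.5, 0.7]
weights = [100, 200, 100]
delta, shifted = logit_shift(probs, weights, target=0.55)
shifted_mean = sum(w * p for p, w in zip(shifted, weights)) / sum(weights)
```

because the shift happens on the logit scale, every adjusted prediction stays strictly between 0 and 1, which a plain additive shift on the probability scale would not guarantee.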

August 12, 2025 at 8:26 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: BLS Jobs Report

let's learn about the CES employer survey that produces the jobs count.

late reporting (a form of nonresponse) results in revisions.

my first (naive !) question: why use employment size in stratification but not nonresponse adjustment ?

August 5, 2025 at 8:09 PM

shira mitchell

@shiraamitchell.bsky.social

blog post: adjusting for interest in politics

whether you respond to a survey (R) may depend on outcome (Y), even after controlling for covariates (X)

what if we can expand this set of X to include interest in politics ?

July 29, 2025 at 8:04 PM

shira mitchell

@shiraamitchell.bsky.social

very excited to teach again soon at NM-AIST

July 29, 2025 at 12:36 AM

shira mitchell

@shiraamitchell.bsky.social

blog post: a new paradigm for polling

so far we assumed response R is independent of outcome Y **within X**

but if R can depend on Y, what to do ?

one idea: use a response instrument Z

statmodeling.stat.columbia.edu/2025/07/22/s...

July 23, 2025 at 12:20 AM

shira mitchell

@shiraamitchell.bsky.social

blog post about longitudinal/panel data:

panel data includes repeated surveys of the same people over time.

this structure can be incorporated into models using person-level effects.

but misspecifying the person-level effects distribution can cause bias.

July 15, 2025 at 8:56 PM

shira mitchell

@shiraamitchell.bsky.social

blog post about imputation:

With nonresponse worsening, we want to adjust for a lot of covariates.

This often means handling many missing covariates.

In theory, fit one big model for everything. But how can practitioners handle this ?

July 9, 2025 at 4:29 AM

shira mitchell

@shiraamitchell.bsky.social

*really* excited for this !

love the name: Structural Zero.

Alan Agresti's Categorical Data Analysis book offers a good explanation (which I'm sure the amazing authors at @hrdag.org will get into):

July 7, 2025 at 2:59 PM

shira mitchell

@shiraamitchell.bsky.social

blog post about Sparsified MRP:

With nonresponse worsening, we want to adjust for a lot of covariates.

Estimates from such big models will be unstable without a lot of data and/or regularization.

Have you seen MRP with sparsifying priors ?

statmodeling.stat.columbia.edu/2025/07/01/s...

July 2, 2025 at 2:15 PM

shira mitchell

@shiraamitchell.bsky.social

for connections to the causal inference literature, I recommend
Peng Ding's excellent textbook highlighting work by Jonathan Hennessy et al and @lmiratrix.bsky.social et al

arxiv.org/abs/2305.18793

June 30, 2025 at 11:41 AM

shira mitchell

@shiraamitchell.bsky.social

blog post about poststratification:

which estimate is best ?
1. unadjusted sample mean
2. classical poststratification
3. regularized poststratification (e.g. MRP)

statmodeling.stat.columbia.edu/2025/06/24/s...
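a toy sketch of the three estimates listed above, under loud assumptions: the data and population counts are made up, and the "regularized" version shrinks each cell mean toward the overall sample mean with a fixed `prior_strength`. that shrinkage is a crude stand-in for the partial pooling in MRP, not a multilevel regression.

```python
def sample_mean(y_by_cell):
    """Unadjusted mean over all respondents, ignoring cells."""
    values = [y for ys in y_by_cell.values() for y in ys]
    return sum(values) / len(values)


def cell_means(y_by_cell):
    """Raw mean within each poststratification cell."""
    return {c: sum(ys) / len(ys) for c, ys in y_by_cell.items()}


def shrunken_cell_means(y_by_cell, prior_strength):
    """Cell means pulled toward the overall sample mean.

    Small cells are pulled more than large ones. This is a
    hypothetical stand-in for MRP's partial pooling.
    """
    grand = sample_mean(y_by_cell)
    return {
        c: (sum(ys) + prior_strength * grand) / (len(ys) + prior_strength)
        for c, ys in y_by_cell.items()
    }


def poststratify(estimates, pop_counts):
    """Weight cell estimates by known population counts."""
    total = sum(pop_counts.values())
    return sum(pop_counts[c] * estimates[c] for c in pop_counts) / total


# Made-up survey: young people are half the population
# but only 4 of 20 respondents.
y_by_cell = {
    "young": [1, 1, 1, 0],
    "old": [1] * 4 + [0] * 12,
}
pop_counts = {"young": 500, "old": 500}

unadjusted = sample_mean(y_by_cell)
classical = poststratify(cell_means(y_by_cell), pop_counts)
regularized = poststratify(shrunken_cell_means(y_by_cell, 4.0), pop_counts)
```

in this toy example the regularized estimate lands between the unadjusted mean and the classical poststratified estimate. which of the three is best depends on how noisy the cell means are and how much the cells truly differ.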

June 25, 2025 at 4:27 AM
