shira mitchell
@shiraamitchell.bsky.social
survey statistician at blue rose research 🏕
blog post: continued struggles with equivalent weights
November 4, 2025 at 9:06 PM
blog post: Blue Rose Research is hiring !

We are looking for a teammate with expertise in both LLM tools and statistical modeling.

Someone who clearly communicates assumptions, results, and uncertainty. With care and kindness.
October 28, 2025 at 8:32 PM
blog post: individualism doesn't work

typical machine learning loss looks at one individual at a time

but for MRP, we care about aggregates
October 22, 2025 at 1:30 PM
blog post: MRPW

you've got a survey collected by someone else, and they gave you weights.

how can you use those weights in MRP (Multilevel Regression and Poststratification) ?
October 15, 2025 at 1:46 PM
blog post: struggles with equivalent weights

you've done MRP.

someone asks you for survey weights.

how to get them ?
October 7, 2025 at 11:56 PM
blog post: beyond balancing

in midterms, voters tend to support the out party for balance

do polls still help predict midterms ? yes
October 1, 2025 at 10:40 AM
blog post: Fat Bear Week

Basu's Bears is a lesson in:

1) using auxiliary information (pre-salmon-feasting weights)

2) how bad an unbiased estimator can be

statmodeling.stat.columbia.edu/2025/09/23/s...
September 23, 2025 at 8:19 PM
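
A minimal sketch of the two lessons in the Fat Bear Week post, in the spirit of Basu's classic elephant example. Every number here is invented for illustration (bear names, weights, selection probabilities), and the ratio estimator is my assumption about one way to use the pre-salmon weights, not necessarily what the blog post does. We sample a single bear with unequal probabilities and estimate the total post-feast weight.

```python
# Hypothetical post-feast weights (kg): the quantity we want to total.
post = {"average_bear": 300.0, "small_bear": 200.0, "huge_bear": 500.0}
# Hypothetical pre-salmon weights (kg): auxiliary information known for every bear.
pre = {"average_bear": 250.0, "small_bear": 170.0, "huge_bear": 420.0}
# Hypothetical design: sample ONE bear, almost always the average one.
prob = {"average_bear": 0.98, "small_bear": 0.01, "huge_bear": 0.01}

true_total = sum(post.values())

# Horvitz-Thompson estimate of the total for each possible sample of size 1.
ht = {b: post[b] / prob[b] for b in post}

# Design expectation of the HT estimator: equals the true total (unbiased).
ht_expected = sum(prob[b] * ht[b] for b in post)

# Ratio-type estimate: scale the sampled bear's growth up to the known pre-salmon total.
pre_total = sum(pre.values())
ratio = {b: post[b] / pre[b] * pre_total for b in post}

for b in post:
    print(f"{b:>13}: HT = {ht[b]:9.1f}   ratio = {ratio[b]:7.1f}")
print(f"true total = {true_total:.1f}, E[HT] = {ht_expected:.1f}")
```

In this toy setup the Horvitz-Thompson estimator is exactly unbiased, yet every single estimate it can produce is far from the true total, while the estimator that leans on the auxiliary weights stays close for every possible sample.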
blog post: random sampling is not leaving

we turned to response instrument Z because random sampling is "dead"

but does this method still rely on starting with random sampling ?
September 16, 2025 at 9:01 PM
blog post on imputation (again):

we want E[Y|X] but X can be missing

@lucystats.bsky.social @sarahlotspeich.bsky.social @glenmartin.bsky.social @maartenvsmeden.bsky.social et al. say:

random imputation should use Y
deterministic imputation shouldn't

statmodeling.stat.columbia.edu/2025/09/09/s...
September 9, 2025 at 8:22 PM
blog post: connections between survey statistics and experimental design.

split-plot designs are analogous to cluster sampling.

blocking is analogous to stratification.

featuring an experiment by Arjun Potter and colleagues at NM-AIST !
September 3, 2025 at 3:25 AM
blog post: Thomas Lumley writes about Interviewing your Laptop

what are the problems with using LLMs as survey respondents ?

how are these similar to problems with poststratification ?

CC @tslumley.bsky.social
August 27, 2025 at 7:37 AM
blog post: answers from the BLS

2 weeks ago we learned about the CES employer survey that produces the jobs count.

we asked: why use employment size in stratification but not nonresponse adjustment ?

BLS responded !

statmodeling.stat.columbia.edu/2025/08/19/s...
August 19, 2025 at 8:53 PM
the BLS is so helpful in their communication !
August 13, 2025 at 8:53 PM
blog post: 2nd helpings of the 2nd flavor of calibration 🍨🍨

in political surveys, we "logit shift" predictions to match known aggregates (e.g. total Democratic votes).

but what happens for multinomial outcomes ?

a fun excuse to review IPF/raking 🍂

statmodeling.stat.columbia.edu/2025/08/12/s...
August 12, 2025 at 8:26 PM
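
A rough sketch of the binary logit shift mentioned in the calibration post, assuming cell-level predicted probabilities and population counts. The function name `logit_shift`, the bisection solver, and all numbers are my own illustrative choices, not taken from the blog post. The idea: find one constant to add on the logit scale so that the population-weighted average prediction matches a known aggregate.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def logit_shift(probs, counts, target, tol=1e-12):
    """Return (delta, shifted_probs) such that the count-weighted mean of
    sigmoid(logit(p) + delta) matches `target`. Solved by bisection,
    which works because the shifted mean is increasing in delta."""
    total = sum(counts)

    def shifted_mean(delta):
        return sum(n * sigmoid(logit(p) + delta) for p, n in zip(probs, counts)) / total

    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if shifted_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    delta = (lo + hi) / 2.0
    return delta, [sigmoid(logit(p) + delta) for p in probs]

# Hypothetical cells: predicted Democratic share and number of voters per cell.
probs = [0.30, 0.55, 0.70]
counts = [1000, 2000, 1500]
target = 0.52  # hypothetical known aggregate share

delta, shifted = logit_shift(probs, counts, target)
calibrated_mean = sum(n * p for p, n in zip(shifted, counts)) / sum(counts)
print(f"delta = {delta:.4f}, calibrated mean = {calibrated_mean:.4f}")
```

Shifting on the logit scale keeps every prediction strictly between 0 and 1 and preserves the ordering of the cells. For multinomial outcomes a single scalar shift is no longer enough, which is where the post's IPF/raking review comes in.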
blog post: BLS Jobs Report

let's learn about the CES employer survey that produces the jobs count.

late reporting (a form of nonresponse) results in revisions.

my first (naive !) question: why use employment size in stratification but not nonresponse adjustment ?
August 5, 2025 at 8:09 PM
blog post: adjusting for interest in politics

whether you respond to a survey (R) may depend on outcome (Y), even after controlling for covariates (X)

what if we can expand this set of X to include interest in politics ?
July 29, 2025 at 8:04 PM
very excited to teach again soon at NM-AIST
July 29, 2025 at 12:36 AM
blog post: a new paradigm for polling

so far we assumed response R is independent of outcome Y **within X**

but if R can depend on Y, what to do ?

one idea: use a response instrument Z

statmodeling.stat.columbia.edu/2025/07/22/s...
July 23, 2025 at 12:20 AM
blog post about longitudinal/panel data:

panel data includes repeated surveys of the same people over time.

this structure can be incorporated into models using person-level effects.

but misspecifying the person-level effects distribution can cause bias.
July 15, 2025 at 8:56 PM
blog post about imputation:

With nonresponse worsening, we want to adjust for a lot of covariates.

This often means handling many missing covariates.

In theory, fit one big model for everything. But how can practitioners handle this ?
July 9, 2025 at 4:29 AM
*really* excited for this !

love the name: Structural Zero.

Alan Agresti's Categorical Data Analysis book offers a good explanation (which I'm sure the amazing authors at @hrdag.org will get into):
July 7, 2025 at 2:59 PM
blog post about Sparsified MRP:

With nonresponse worsening, we want to adjust for a lot of covariates.

Estimates from such big models will be unstable without a lot of data and/or regularization.

Have you seen MRP with sparsifying priors ?

statmodeling.stat.columbia.edu/2025/07/01/s...
July 2, 2025 at 2:15 PM
for connections to the causal inference literature, I recommend
Peng Ding's excellent textbook highlighting work by Jonathan Hennessy et al and @lmiratrix.bsky.social et al

arxiv.org/abs/2305.18793
June 30, 2025 at 11:41 AM
blog post about poststratification:

which estimate is best ?
1. unadjusted sample mean
2. classical poststratification
3. regularized poststratification (e.g. MRP)

statmodeling.stat.columbia.edu/2025/06/24/s...
June 25, 2025 at 4:27 AM
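
A small sketch contrasting the first two estimators in the poststratification post, using invented data (two age cells, ten respondents, made-up population counts). The function names are hypothetical. Regularized poststratification such as MRP would replace the raw cell means below with model-based predictions; that step is not shown here.

```python
from collections import defaultdict

def sample_mean(responses):
    """Unadjusted mean of the outcome over all respondents."""
    return sum(y for _, y in responses) / len(responses)

def poststratify(responses, pop_counts):
    """Classical poststratification: the mean of y within each cell,
    weighted by that cell's known share of the population."""
    sums = defaultdict(float)
    ns = defaultdict(int)
    for cell, y in responses:
        sums[cell] += y
        ns[cell] += 1
    empty = [c for c in pop_counts if ns[c] == 0]
    if empty:
        # Classical poststratification breaks down with empty cells;
        # this is one motivation for regularized versions.
        raise ValueError(f"no respondents in cells: {empty}")
    pop_total = sum(pop_counts.values())
    return sum(pop_counts[c] / pop_total * sums[c] / ns[c] for c in pop_counts)

# Hypothetical sample: (cell, outcome). Young people are underrepresented.
responses = [("young", 1), ("young", 1)] + [("old", 1)] * 2 + [("old", 0)] * 6
# Hypothetical known population counts per cell.
pop_counts = {"young": 500, "old": 500}

raw = sample_mean(responses)
ps = poststratify(responses, pop_counts)
print(f"unadjusted mean = {raw:.3f}, poststratified = {ps:.3f}")
```

Here the young cell is 20% of the sample but 50% of the population, so reweighting the cell means moves the estimate from 0.4 to 0.625. The cost is that the young-cell mean rests on only two respondents, which is the instability that regularization is meant to tame.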