shira mitchell
@shiraamitchell.bsky.social
survey statistician at blue rose research 🏕
blog post: continued struggles with equivalent weights
November 4, 2025 at 9:06 PM
blog post: Blue Rose Research is hiring !
We are looking for a teammate with expertise in both LLM tools and statistical modeling.
Someone who clearly communicates assumptions, results, and uncertainty. With care and kindness.
October 28, 2025 at 8:32 PM
blog post: individualism doesn't work
typical machine learning loss looks at one individual at a time
but for MRP, we care about aggregates
October 22, 2025 at 1:30 PM
blog post: MRPW
you've got a survey collected by someone else, and they gave you weights.
how can you use those weights in the MRP (Multilevel Regression and Poststratification) ?
October 15, 2025 at 1:46 PM
blog post: struggles with equivalent weights
you've done MRP.
someone asks you for survey weights.
how to get them ?
October 7, 2025 at 11:56 PM
blog post: beyond balancing
in midterms, voters tend to support the out party for balance
do polls still help predict midterms ? yes
October 1, 2025 at 10:40 AM
blog post: Fat Bear Week
Basu's Bears is a lesson in:
1) using auxiliary information (pre-salmon-feasting weights)
2) how bad an unbiased estimator can be
statmodeling.stat.columbia.edu/2025/09/23/s...
September 23, 2025 at 8:19 PM
blog post: random sampling is not leaving
we turned to response instrument Z because random sampling is "dead"
but does this method still rely on starting with random sampling ?
September 16, 2025 at 9:01 PM
blog post on imputation (again):
we want E[Y|X] but X can be missing
@lucystats.bsky.social @sarahlotspeich.bsky.social @glenmartin.bsky.social @maartenvsmeden.bsky.social et al. say:
random imputation should use Y
deterministic imputation shouldn't
statmodeling.stat.columbia.edu/2025/09/09/s...
September 9, 2025 at 8:22 PM
blog post: connections between survey statistics and experimental design.
split-plot designs are analogous to cluster sampling.
blocking is analogous to stratification.
featuring an experiment by Arjun Potter and colleagues at NM-AIST !
September 3, 2025 at 3:25 AM
blog post: Thomas Lumley writes about Interviewing your Laptop
what are the problems with using LLMs as survey respondents ?
how are these similar to problems with poststratification ?
CC @tslumley.bsky.social
August 27, 2025 at 7:37 AM
blog post: answers from the BLS
2 weeks ago we learned about the CES employer survey that produces the jobs count.
we asked: why use employment size in stratification but not nonresponse adjustment ?
BLS responded !
statmodeling.stat.columbia.edu/2025/08/19/s...
August 19, 2025 at 8:53 PM
the BLS is so helpful in their communication !
August 13, 2025 at 8:53 PM
blog post: 2nd helpings of the 2nd flavor of calibration 🍨🍨
in political surveys, we "logit shift" predictions to match known aggregates (e.g. total Democratic votes).
but what happens for multinomial outcomes ?
a fun excuse to review IPF/raking 🍂
statmodeling.stat.columbia.edu/2025/08/12/s...
August 12, 2025 at 8:26 PM
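For readers new to the binary case, here is a minimal sketch of a logit shift: find one constant, added to every prediction on the logit scale, so that the weighted average matches a known aggregate. The function name `logit_shift`, the toy predictions, the weights, and the target are all made up for illustration. The linked post may set things up differently, and the multinomial case it asks about is not covered here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def logit_shift(preds, weights, target, tol=1e-10):
    """Find delta so the weighted mean of sigmoid(logit(p) + delta) equals target."""
    total = sum(weights)

    def shifted_mean(delta):
        return sum(w * sigmoid(logit(p) + delta)
                   for p, w in zip(preds, weights)) / total

    # The shifted mean is increasing in delta, so bisection is enough.
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if shifted_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    delta = (lo + hi) / 2.0
    return delta, [sigmoid(logit(p) + delta) for p in preds]

# Hypothetical inputs: three cells with predicted Democratic share and cell size.
preds = [0.30, 0.55, 0.70]
weights = [1000, 2000, 1500]
target = 0.52  # assumed known aggregate share

delta, shifted = logit_shift(preds, weights, target)
shifted_share = sum(w * s for w, s in zip(weights, shifted)) / sum(weights)
print(delta, shifted, shifted_share)
```

Shifting on the logit scale keeps every adjusted prediction strictly between 0 and 1, which a simple additive shift on the probability scale would not guarantee.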
blog post: BLS Jobs Report
let's learn about the CES employer survey that produces the jobs count.
late reporting (a form of nonresponse) results in revisions.
my first (naive !) question: why use employment size in stratification but not nonresponse adjustment ?
August 5, 2025 at 8:09 PM
blog post: adjusting for interest in politics
whether you respond to a survey (R) may depend on outcome (Y), even after controlling for covariates (X)
what if we can expand this set of X to include interest in politics ?
July 29, 2025 at 8:04 PM
very excited to teach again soon at NM-AIST
July 29, 2025 at 12:36 AM
blog post: a new paradigm for polling
so far we assumed response R is independent of outcome Y **within X**
but if R can depend on Y, what to do ?
one idea: use a response instrument Z
statmodeling.stat.columbia.edu/2025/07/22/s...
July 23, 2025 at 12:20 AM
blog post about longitudinal/panel data:
panel data includes repeated surveys of the same people over time.
this structure can be incorporated into models using person-level effects.
but misspecifying the person-level effects distribution can cause bias.
July 15, 2025 at 8:56 PM
blog post about imputation:
With nonresponse worsening, we want to adjust for a lot of covariates.
This often means handling many missing covariates.
In theory, fit one big model for everything. But how can practitioners handle this ?
July 9, 2025 at 4:29 AM
*really* excited for this !
love the name: Structural Zero.
Alan Agresti's Categorical Data Analysis book offers a good explanation (which I'm sure the amazing authors at @hrdag.org will get into):
July 7, 2025 at 2:59 PM
blog post about Sparsified MRP:
With nonresponse worsening, we want to adjust for a lot of covariates.
Estimates from such big models will be unstable without a lot of data and/or regularization.
Have you seen MRP with sparsifying priors ?
statmodeling.stat.columbia.edu/2025/07/01/s...
July 2, 2025 at 2:15 PM
for connections to the causal inference literature, I recommend
Peng Ding's excellent textbook highlighting work by Jonathan Hennessy et al and @lmiratrix.bsky.social et al
arxiv.org/abs/2305.18793
June 30, 2025 at 11:41 AM
blog post about poststratification:
which estimate is best ?
1. unadjusted sample mean
2. classical poststratification
3. regularized poststratification (e.g. MRP)
statmodeling.stat.columbia.edu/2025/06/24/s...
June 25, 2025 at 4:27 AM
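To make the three estimates concrete, here is a toy sketch on made-up data. The cells, outcomes, and population shares are hypothetical. The "regularized" version uses a fixed-strength shrinkage of cell means toward the overall mean (`prior_n` is an assumed tuning constant) as a crude stand-in for partial pooling; real MRP fits a multilevel regression instead. It shows how the estimates are computed, not which one is best.

```python
# Hypothetical sample: (poststratification cell, binary outcome).
sample = [("young", 1), ("young", 0),
          ("old", 1), ("old", 1), ("old", 1), ("old", 0), ("old", 1), ("old", 1)]
# Assumed known population share of each cell (young are underrepresented above).
pop_share = {"young": 0.5, "old": 0.5}

def sample_mean(sample):
    return sum(y for _, y in sample) / len(sample)

def cell_means(sample):
    sums, counts = {}, {}
    for cell, y in sample:
        sums[cell] = sums.get(cell, 0) + y
        counts[cell] = counts.get(cell, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}, counts

def poststratify(means, pop_share):
    # Weight each cell's estimate by its population share.
    return sum(pop_share[c] * means[c] for c in pop_share)

def shrunken_means(means, counts, overall, prior_n=4.0):
    # Pull each cell mean toward the overall mean; small cells are pulled more.
    return {c: (counts[c] * means[c] + prior_n * overall) / (counts[c] + prior_n)
            for c in means}

# 1. unadjusted sample mean
unadjusted = sample_mean(sample)
# 2. classical poststratification
means, counts = cell_means(sample)
classical = poststratify(means, pop_share)
# 3. regularized poststratification (shrinkage stand-in, not a fitted MRP model)
regularized = poststratify(shrunken_means(means, counts, unadjusted), pop_share)

print(unadjusted, classical, regularized)
```

In this toy example the regularized estimate lands between the other two: the two-person "young" cell is noisy, so its mean is pulled toward the overall mean before being reweighted.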