Lightnews — Scholar-powered news

Bruno Ferman

@brunoferman.bsky.social

Thanks, Julian!!!

May 1, 2025 at 7:29 PM

Bruno Ferman

@brunoferman.bsky.social

Thanks, Arin!!!

May 1, 2025 at 7:28 PM

Bruno Ferman

@brunoferman.bsky.social

That's a quick overview!

For more details, check out the full survey 📚👇

Link: arxiv.org/abs/2504.19841

Hope you find it helpful!
Feedback welcome. 🧠✍️

Inference with few treated units

In many causal inference applications, only one or a few units (or clusters of units) are treated. An important challenge in such settings is that standard inference methods that rely on asymptotic th...

arxiv.org

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

15/
Applied folks: we hope this serves as a warning that standard inference may fail with few treated units + guidance on choosing alternatives.
Econometricians: we wanted to provide a state-of-the-art overview — and a call for new methods based on alternative assumptions!

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

14/
And show some equivalences:

e.g., the wild bootstrap (with the null imposed) is asymptotically equivalent to sign-changes when N₁ is fixed and N₀ → ∞

⇒ theoretical justification for wild-bootstrap in these settings

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

13/
We also provide finite-N₀ improvements for some methods, such as Conley-Taber and sign-changes.

Free lunch: gains with finite N₀ & asymptotically equivalent when N₀ → ∞ (with N₁ fixed)

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

12/
What if we have >1 treated (but still few)?

More info on treated ⇒ alternatives: sign-changes, Behrens-Fisher solutions, etc

Relax some assumptions relative to previous methods (but need new ones!)
⚡Power may be an issue when N₁ is very small

Many relevant trade-offs!
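
A minimal sketch of a sign-changes test, to make the power point concrete. It assumes we already have one effect estimate per treated unit (the hypothetical `effects` array, e.g. each treated outcome minus the control mean) and that, under the null, these estimates are roughly independent and symmetric around zero. The function name and inputs are illustrative, not taken from the survey.

```python
import itertools

import numpy as np


def sign_change_pvalue(effects):
    """p-value from enumerating all sign flips of per-unit effect estimates.

    Assumes (for illustration) that under the null the estimates are
    independent and symmetric around zero.
    """
    effects = np.asarray(effects, dtype=float)
    observed = abs(effects.mean())
    flipped_stats = []
    # Enumerate all 2^N1 sign vectors; feasible because N1 is small.
    for signs in itertools.product([1.0, -1.0], repeat=effects.size):
        flipped_stats.append(abs((np.array(signs) * effects).mean()))
    # Small tolerance guards against floating-point ties.
    return float(np.mean(np.array(flipped_stats) >= observed - 1e-12))


# With N1 = 4 the smallest attainable p-value is 2 / 2**4 = 0.125,
# so a 5% test can never reject, however large the estimates are.
print(sign_change_pvalue([1.0, 2.0, 3.0, 4.0]))
```

Because the all-plus and all-minus sign vectors always tie with the observed statistic, the p-value cannot fall below 2^(1-N₁), which is one way to see why power suffers when N₁ is very small.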

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

11/
In these extreme cases: need to impose strong restrictions on treatment effect heterogeneity!

If interested, see discussion in Section 4.1.3 on inference on sharp nulls, inference on realized treatment effects, prediction intervals, and sensitivity analysis.

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

10/
📌Extrapolate from time series

Learn about treated error using pre-treatment residuals

⚡Flip assumptions
Need time series restrictions (stationarity) but relax assumptions on cross-section

Challenges arise when counterfactuals are estimated via high-dimensional approaches
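
A rough sketch of the time-series idea, under simplifying assumptions: one treated unit, one post-treatment period, and a hypothetical array `gaps_pre` holding the pre-treatment gaps between the treated unit and its estimated counterfactual. Assuming the gaps are stationary, the pre-period gaps stand in for the distribution of the post-period error. Names are illustrative only.

```python
import numpy as np


def pre_period_pvalue(gaps_pre, gap_post):
    """Share of centered pre-treatment gaps at least as extreme as the effect.

    Assumes (for illustration) stationary gaps, so that pre-treatment
    residuals are informative about the post-treatment error.
    """
    gaps_pre = np.asarray(gaps_pre, dtype=float)
    baseline = gaps_pre.mean()
    centered = gaps_pre - baseline   # pre-treatment residuals
    effect = gap_post - baseline     # estimated effect in the post period
    return float(np.mean(np.abs(centered) >= abs(effect)))


print(pre_period_pvalue([-1.0, 0.0, 1.0, 0.0], gap_post=5.0))
```

Note that nothing here uses the cross-section of controls for inference, which is the sense in which the assumptions are flipped relative to the control-based approach.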

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

9/
Ferman and Pinto (2019): allow for heteroskedasticity that can be estimated based on observables.

Example: when units have different variances due to variation in population sizes.

See this old Twitter thread: x.com/bruno_ferman...

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

8/
📌Extrapolate (learn) from control units

Learn the distribution of the treated error using controls' residuals (à la Conley and Taber)
⚡Key assumption: Errors of treated and control units must have the same distribution (homoskedasticity)
No restriction on time series!
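
A simplified cross-sectional sketch of this idea (not the full Conley and Taber procedure, which is set in DiD). It assumes one treated unit, a hypothetical array `y_controls` of control outcomes, and that treated and control errors share the same distribution, so the control residuals approximate the null distribution of the estimator.

```python
import numpy as np


def control_residual_pvalue(y_treated, y_controls):
    """Share of control residuals at least as extreme as the estimate.

    Assumes (for illustration) that treated and control errors have the
    same distribution and that there are many controls.
    """
    y_controls = np.asarray(y_controls, dtype=float)
    control_mean = y_controls.mean()
    estimate = y_treated - control_mean   # difference in means
    residuals = y_controls - control_mean
    return float(np.mean(np.abs(residuals) >= abs(estimate)))


print(control_residual_pvalue(1.5, [-2.0, -1.0, 0.0, 1.0, 2.0]))
```

If the treated unit is noisier than the controls, the residuals understate the spread of the treated error and this p-value is too small, which is why the homoskedasticity assumption is key.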

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

7/
Survey is organized based on data availability.

📌Limit case:
One treated unit & one treated period.

Enough info from the treated to construct an estimator — but no info from the treated to learn its distribution!

⚡Solution:
We need to *extrapolate* ⇒ stronger assumptions!

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

6/
We focus on model-based approaches, more common in metrics

📚 Nice citation from Haavelmo to justify this framework + Marvel movies to help make the point 🕷️ :)

We also discuss design-based approaches at the end

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

5/
Important:
📌Problems arise when the *number* of treated units is small

✅Standard methods are usually fine with 40 or 50 treated units, even when the *share* of treated is small.

Feel free to cite our survey to justify sticking to standard methods when that's your case!😉

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

4/
Extreme case: you have only 1 treated and N₀ controls.

The true variance is σ₁² + σ₀²/N₀.

But with only one treated, you just don’t have enough info to estimate σ₁² using only the treated!

Robust SEs simply set σ̂₁² = 0! 😵‍💫

σ₁²: var of treated
σ₀²: var of control
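
A small simulation sketch of this extreme case, assuming normal errors with σ₁ = σ₀ = 1 and a true effect of zero (illustrative choices, not from the survey). The treated unit's residual is identically zero, so the estimated variance keeps only the σ̂₀²/N₀ term.

```python
import numpy as np


def rejection_rate_one_treated(n_controls=50, n_sims=5000, seed=0):
    """Rejection rate of a nominal 5% t-test with one treated unit.

    The true effect is zero, so any rejection is a false positive.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        y_treated = rng.normal(0.0, 1.0)
        y_controls = rng.normal(0.0, 1.0, size=n_controls)
        estimate = y_treated - y_controls.mean()
        # With a single treated unit its residual is zero: sigma1_hat^2 = 0.
        var_hat = 0.0 + y_controls.var(ddof=1) / n_controls
        t_stat = estimate / np.sqrt(var_hat)
        rejections += abs(t_stat) > 1.96
    return rejections / n_sims


print(rejection_rate_one_treated())
```

Under these assumptions the rejection rate lands far above the nominal 5%, because the standard error misses the dominant σ₁² term.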

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

3/
Example to illustrate problem: comparison of means

Robust SEs estimate the variance of treated (controls) using only treated (controls) data

✅ Great with many treated/many controls!
↪️ Allow for ≠ distributions of treated/control errors

❗ Go bad with few treated units...

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

2/
🗣️Main message

Few treated ⇒ need to rely on stronger assumptions

Many alternatives: varying in data requirements, assumptions, etc

Choice is highly context-specific. We’ll help you navigate that!

Cover cross-section and panel data (Regression, Matching, DiD, SC, etc)

April 29, 2025 at 2:18 PM

Bruno Ferman

@brunoferman.bsky.social

1/
Link to paper: arxiv.org/abs/2504.19841

🚨Problem
Few treated ⇒ standard methods (e.g., robust/clustered SEs) can go wrong. Even if total N is large!

📌Example
DiD with 1 treated cluster: clustered SEs underestimate the true variance by a factor of N. Expect over-rejection rates >60%!

April 29, 2025 at 2:18 PM
