Emily Byun
@yewonbyun.bsky.social
PhD Student in Machine Learning at CMU. yewonbyun.github.io
💡 Can we trust synthetic data for statistical inference?

We show that synthetic data (e.g., LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moment residuals of synthetic data and those of real data.

2/ In limited-label regimes, LLMs give practitioners a cheap way to obtain imperfect labels and even to generate entirely new synthetic samples.

10/ Empirically, we observe large gains in estimation performance (lower MSE and tighter confidence intervals with valid coverage) across diverse computational social science tasks, with benefits most pronounced in low-label regimes.

October 10, 2025 at 4:12 PM
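The "moment residual" intuition above can be illustrated with a minimal sketch in the style of prediction-powered inference: use many cheap synthetic (LLM) labels for the bulk of the estimate, then correct their bias with a rectifier estimated from the residuals on the small labeled real sample. All data here are simulated and the estimator is an assumption for illustration, not the paper's actual method.

```python
# Minimal sketch: debiased mean estimation combining a small labeled real
# sample with a large pool of synthetic (LLM) labels. Hypothetical data;
# a PPI-style estimator, not necessarily the thread's exact construction.
import numpy as np

rng = np.random.default_rng(0)

n, N = 50, 5000                                 # few real labels, many synthetic ones
y_real = rng.normal(1.0, 1.0, size=n)           # true labels on real data (true mean = 1.0)
yhat_real = y_real + rng.normal(0.2, 0.5, n)    # LLM labels on the same items (biased by ~0.2)
yhat_synth = rng.normal(1.2, 1.1, size=N)       # LLM labels on synthetic samples (same bias)

# Classical estimate: uses only the n real labels.
theta_classical = y_real.mean()

# Synthetic-powered estimate: synthetic-label mean plus a rectifier
# built from the residuals (y - yhat) on the labeled real sample.
rectifier = (y_real - yhat_real).mean()
theta_corrected = yhat_synth.mean() + rectifier
```

When the residuals of the synthetic labels track those on real data, the rectifier cancels the shared bias, so the corrected estimate stays (approximately) unbiased while borrowing the variance reduction of the large synthetic pool.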