Daniil Dmitriev
ddmitriev.bsky.social
math of data science
Postdoc at UPenn
We obtain information-theoretically optimal list size and recovery error, and provide an empirical comparison with prior methods.

link: arxiv.org/abs/2407.15792

Joint with @raresbuhai.bsky.social, Stefan Tiegel, Alex Wolters, Gleb Novikov, @amartyasanyal.bsky.social, David Steurer, and Fanny Yang.
Robust Mixture Learning when Outliers Overwhelm Small Groups
We study the problem of estimating the means of well-separated mixtures when an adversary may add arbitrary outliers. While strong guarantees are available when the outlier fraction is significantly s...
December 10, 2024 at 8:31 PM
Our method works in the presence of a large fraction of outliers if the mixture components are spherical Gaussians or, more generally, have bounded k-th sub-Gaussian moments.

We propose a reduction from the robust mixture learning problem to a well-studied list-decodable mean estimation problem.
December 10, 2024 at 8:31 PM
When the fraction of outliers is negligible compared to the weight of the smallest component, existing algorithms recover all means with optimal error.

However, when the fraction of outliers exceeds the weight of the smallest component, prior methods suffer in both recovery error and list size.
December 10, 2024 at 8:31 PM