UMass Machine Learning and Friends Lunch | Main / Mixture Models Robustness And Sum-of-Squares Proofs

Abstract: We use the Sum of Squares (SoS) method to develop a new efficient algorithm for clustering and mean estimation in well-separated high-dimensional mixture models, substantially improving upon the statistical guarantees achieved by previous efficient algorithms. In particular, we study mixtures of k distributions, where every pair of distributions has means separated by at least k^epsilon for any epsilon > 0. In the special case of spherical Gaussian mixtures, we give a k^O(1/epsilon^2)-time algorithm that learns the means of the components in the mixture and accurately clusters samples from the mixture. This is the first algorithm to improve on greedy (“single-linkage”) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^1/4.

Our techniques are based on adapting algorithmic ideas from robust statistics, and are potentially of independent interest. Our main algorithm for learning mixture models provides an entirely SoS interpretation of the convex programming framework of [Diakonikolas et al, FOCS 16]. We show that many of the proofs from that paper can be replaced with much simpler proofs using only basic concentration and Holder's inequality, which allows us to approach this problem via SoS. As a corollary of this, we also obtain improved rates for robust mean estimation in certain regimes.

Joint work with Sam Hopkins, Cornell.

Bio: Jerry Li is a Ph.D student in CSAIL, working with Ankur Moitra. His main research interests are in developing theoretically sound machine learning algorithms for real world modern systems. So far this has lead to the study of provably secure machine learning, and the study of distributed and massively parallel learning algorithms. For his research, he has been awarded a Simons Fellowship and an NSF Fellowship.