LJK D.A.T.A. Seminar

On Thursday, November 26, 2020 at 14:00 (room to be announced)

Seminar by Yury POLYANSKIY (MIT)

Self-Regularizing Property of the Nonparametric Maximum Likelihood Estimator in Mixture Models

Summary

Introduced by Kiefer and Wolfowitz (1956), the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture models and empirical Bayes estimation. Sidestepping the nonconvexity of the mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of all probability measures, which can be viewed as an extreme form of overparameterization.
In this work we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size n has O(log n) atoms (mass points), significantly improving the deterministic upper bound of n due to Lindsay (1983). Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with O(log n) components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, the NPMLE automatically chooses the right model complexity, a property we term self-regularization. Extensions to other exponential families are given. As a statistical application, we show that this structural property can be harnessed to bootstrap existing Hellinger risk bounds for the (parametric) MLE of finite Gaussian mixtures to the NPMLE for general Gaussian mixtures, recovering a result of Zhang (2009). Time permitting, we will discuss connections to achieving the optimal regret in empirical Bayes. This is based on joint work with Yihong Wu (Yale).
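To give a concrete sense of the objective, here is a minimal sketch of the NPMLE for a unit-variance Gaussian location mixture, approximated on a fixed grid of candidate atoms and fit by EM. This grid discretization and the function name `npmle_em` are illustrative assumptions, not the method or terminology of the talk; with the atoms fixed, the likelihood is concave in the weights, so EM converges to the grid-restricted maximizer.

```python
import numpy as np

def npmle_em(x, grid, n_iter=500):
    """Approximate the NPMLE mixing distribution on a fixed atom grid via EM.

    x    : (n,) observations from a Gaussian location mixture, unit variance.
    grid : (m,) candidate atom locations for the mixing distribution.
    Returns the (m,) vector of mixing weights on the grid.
    """
    # Unit-variance Gaussian likelihood of each observation at each atom.
    lik = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)
    w = np.full(len(grid), 1.0 / len(grid))  # start from uniform weights
    for _ in range(n_iter):
        post = lik * w                           # joint over (obs, atom)
        post /= post.sum(axis=1, keepdims=True)  # posterior over atoms
        w = post.mean(axis=0)                    # EM weight update
    return w

# Sample from a 2-component Gaussian mixture with means -2 and 2.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(loc=rng.choice([-2.0, 2.0], size=n), scale=1.0)
grid = np.linspace(-5, 5, 201)
w = npmle_em(x, grid)
# Atoms carrying non-negligible mass; in practice the fitted mixing
# distribution concentrates on a few clusters of grid points, far fewer
# than the n atoms allowed by Lindsay's bound.
support = grid[w > 1e-3]
```

Running this typically yields a support far smaller than the grid, illustrating (in a discretized toy setting) the self-regularization phenomenon the abstract describes.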
Video link: https://cloudljk.imag.fr/index.php/s/59wk7yicec27Ege
