Clustering, community detection, and computation-information gaps
Séminaire Données et Aléatoire Théorie & Applications
22/05/2025 - 14:00 Nicolas Verzelen Salle JM Chassery (GIPSA-lab)
This talk is devoted to clustering and community detection which are among the most emblematic problems in unsupervised learning. I will introduce toy models, namely sub-Gaussian Mixture Model (sGMM) and Stochastic Block Model (SBM), describe their fundamental limits, and explain the intuition behind popular polynomial-time procedures. Interestingly, these models seem to exhibit computation-information gaps: no polynomial-time procedure is known to achieve the optimal statistical performances. In a second part, I will present recent approaches for providing evidence of computational hardness for such statistical tasks. A particular emphasis will be given to the low-degree polynomial framework.