Expectation-Propagation performs smooth gradient descent
Séminaire Probabilités & Statistique
22/06/2017 - 14:00 Guillaume DEHAENE (EPFL) Salle 106 - Bâtiment IMAG
In most applications of Bayesian inference, the key object, the posterior probability distribution, is intractable to compute. One way to deal with this issue is to compute a parametric (e.g., Gaussian) approximation of the true, intractable distribution, for example using Expectation Propagation (EP; Minka, 2001) or Stochastic Variational Inference (SVI; Hoffman et al., 2013). Both proceed by iteratively improving a parametric approximation of the true posterior distribution until a fixed point is reached. Both algorithms have now existed for some time, but they remain somewhat poorly understood. In this talk, I will shed new intuitive light on the behavior of the sequence of approximations produced by EP and SVI when they are used to compute a Gaussian approximation. I will do this by linking both algorithms to gradient descent, more precisely to Newton's method: gradient descent with a Hessian correction. I will show that both EP and SVI can be understood as performing gradient descent on a smoothed version of the energy landscape.
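To fix ideas, here is a minimal sketch of the two ingredients mentioned above, in notation of my own choosing (the symbols E, theta, mu, Sigma are not taken from the talk): a Newton step on the energy E(theta) = -log p(theta | y), and a Gaussian-smoothed version of that energy. The abstract's claim is that the EP and SVI updates for a Gaussian approximation can be read as Newton-style descent on such a smoothed landscape; the precise form of the smoothing and of the correspondence is the subject of the talk.

\[
E(\theta) \;=\; -\log p(\theta \mid y),
\qquad
\theta_{t+1} \;=\; \theta_t - \left[\nabla^2 E(\theta_t)\right]^{-1} \nabla E(\theta_t)
\quad \text{(Newton step)}
\]
\[
E_\Sigma(\mu) \;=\; \mathbb{E}_{z \sim \mathcal{N}(0,\,\Sigma)}\!\left[ E(\mu + z) \right]
\quad \text{(Gaussian-smoothed energy landscape)}
\]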