Segmentation uncertainty in multiple change-point models


Probability & Statistics Seminar

16/01/2014 - 14:00 Yann GUÉDON (CIRAD, Inria Virtual Plants team, Montpellier) Room 1 - Tour IRMA

We address the retrospective, or off-line, multiple change-point detection problem. In this context, there is a need for efficient diagnostic tools that make it possible to localize the segmentation uncertainty along the observed sequence. So far, work on segmentation uncertainty has focused mainly on the uncertainty of the change-point positions. We propose to state this problem in a new way, viewing multiple change-point models as latent structure models and using results from information theory. This led us to show that the segmentation uncertainty is not reflected in the posterior distributions of the change-point positions, because of the marginalization that is intrinsic to the computation of these posterior distributions. The entropy of the segmentation of a given observed sequence can be considered the canonical measure of segmentation uncertainty. This segmentation entropy can be decomposed into conditional entropy profiles that make it possible to localize this canonical segmentation uncertainty along the sequence. One of the main outcomes of this work is the derivation of efficient algorithms to compute these conditional entropy profiles. The proposed approach inherits all the properties implied by the Shannon-Khinchin axioms of entropy and is therefore the unique approach for localizing the canonical segmentation uncertainty along the sequence. We introduce the Kullback-Leibler divergence of the uniform distribution from the segmentation distribution, computed for successive numbers of change points, as a new tool for assessing the number of change points selected by different methods. The proposed approach is illustrated using contrasting examples.
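The quantities named in the abstract can be illustrated on a toy sequence. The following is a minimal brute-force sketch, not the efficient algorithms of the talk: it enumerates every segmentation with a fixed number of change points, weights each by a Gaussian likelihood (piecewise-constant mean, unit variance, segment mean plugged in, uniform prior over segmentations; all of these are assumptions made for the example), and then computes the segmentation entropy, the divergence from the uniform distribution, and the marginal change-point position probabilities. The function names are hypothetical, and the divergence is taken here as D(P || U) = log N - H(P), where N is the number of segmentations.

```python
import itertools
import math


def segment_log_lik(xs):
    """Gaussian log-likelihood of one segment (assumed unit variance),
    with the segment mean replaced by its empirical average."""
    m = sum(xs) / len(xs)
    rss = sum((v - m) ** 2 for v in xs)
    return -0.5 * rss - 0.5 * len(xs) * math.log(2 * math.pi)


def segmentation_posterior(x, n_changes):
    """Enumerate all segmentations with exactly n_changes change points
    and return them with their normalized weights (uniform prior assumed).
    A change point at position t means a new segment starts at index t."""
    T = len(x)
    segs = list(itertools.combinations(range(1, T), n_changes))
    log_w = []
    for cps in segs:
        bounds = (0,) + cps + (T,)
        log_w.append(sum(segment_log_lik(x[a:b])
                         for a, b in zip(bounds, bounds[1:])))
    mx = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(l - mx) for l in log_w]
    z = sum(w)
    return segs, [wi / z for wi in w]


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_from_uniform(p):
    """D(P || U) for U uniform on the same support: log N - H(P)."""
    return math.log(len(p)) - entropy(p)


def change_point_marginals(segs, p, T):
    """Marginal probability of a change point at each position 1..T-1,
    obtained by summing over all segmentations containing that position."""
    marg = [0.0] * T
    for cps, pi in zip(segs, p):
        for t in cps:
            marg[t] += pi
    return marg


x = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
segs, post = segmentation_posterior(x, n_changes=1)
print("segmentation entropy:", entropy(post))
print("divergence from uniform:", kl_from_uniform(post))
print("position marginals:", change_point_marginals(segs, post, len(x)))
```

Enumeration grows combinatorially with the sequence length, so this only serves to make the definitions concrete. The marginals discard the joint structure of the segmentation, which is the marginalization effect the abstract refers to, whereas the entropy is computed on the full segmentation distribution.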

Keywords: Entropy; Kullback-Leibler divergence; Latent structure model; Multiple change-point detection; Smoothing algorithm.

Guédon, Y. (2013). Segmentation uncertainty in multiple change-point models. Statistics and Computing, in press.