Probability distributions and likelihood functions as compositional data: Bayesian updating, calibration and discriminant analysis

Seminar Données et Aléatoire Théorie & Applications

17/10/2024 - 14:00 Paul-Gauthier Noé Salle 106

In the binary case, i.e. when there are only two exhaustive and mutually exclusive hypotheses, Bayes' rule can be written as the sum of the log-ratio of the prior probabilities and the log-likelihood ratio (LLR), which gives the log-ratio of the posterior probabilities. After discussing some calibration properties of the LLR, we will see how they can be used to design a new discriminant analysis in which the discriminant component forms a calibrated LLR.
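As a small illustration of the additive form described above, the following Python sketch checks numerically that adding the LLR to the prior log-odds yields the same posterior as applying Bayes' rule directly. The function names and the numerical values are hypothetical and chosen only for this example; they are not taken from the talk.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def posterior_from_llr(prior_h1, llr):
    """Posterior P(H1 | x) from the prior P(H1) and the LLR log[p(x|H1) / p(x|H2)]."""
    # Additive form: posterior log-odds = prior log-odds + LLR.
    return sigmoid(logit(prior_h1) + llr)

# Illustrative (assumed) values for the prior and the two likelihoods.
prior = 0.2
lik_h1, lik_h2 = 0.6, 0.1

# Bayes' rule in its usual multiplicative form.
direct = prior * lik_h1 / (prior * lik_h1 + (1.0 - prior) * lik_h2)

# Bayes' rule in its additive (log-odds) form.
additive = posterior_from_llr(prior, math.log(lik_h1 / lik_h2))

print(direct, additive)
```

Both computations give a posterior of 0.6 for the first hypothesis, up to floating-point rounding.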
However, this additive form of Bayes' rule has been considered non-extensible beyond the binary case, which makes it nontrivial to extend the above results to cases with more than two hypotheses. We will see how the Aitchison geometry of the simplex, which comes from the field of compositional data analysis, overcomes this limitation.
Indeed, in a probability distribution, what really matters is the value of each probability relative to the others rather than any single value on its own; hence the use of odds and ratios of probabilities. Data of this kind, where the relative information between the values is what really matters, are known as compositional data. In compositional data analysis, the Aitchison geometry of the simplex has been proposed to deal with this type of data. We will see how this geometry allows us to recover the additive form of Bayes' rule, extending the concept of LLR to the non-binary case. We will see how the calibration properties of LLRs naturally extend to higher-dimensional likelihood functions and how the proposed discriminant analysis scales beyond the binary case.
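To make the recovered additive form concrete, here is a minimal NumPy sketch with three hypotheses. It relies on two standard notions from compositional data analysis: the perturbation operation (component-wise product followed by renormalisation), which plays the role of addition in the Aitchison geometry, and the centred log-ratio (clr) transform. The choice of clr coordinates is an assumption made for simplicity, and the talk may use a different log-ratio representation; the helper names and the numbers are hypothetical.

```python
import numpy as np

def closure(x):
    """Rescale a vector of positive values so that it sums to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(p, q):
    """Aitchison perturbation: component-wise product, then closure."""
    return closure(np.asarray(p, dtype=float) * np.asarray(q, dtype=float))

def clr(p):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logp = np.log(np.asarray(p, dtype=float))
    return logp - logp.mean()

# Illustrative (assumed) prior over three hypotheses and likelihood values
# of one observation under each hypothesis.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.1, 0.4, 0.7])

# Bayes' rule is the perturbation of the prior by the likelihood function.
posterior = perturb(prior, likelihood)

# In clr coordinates, the same update is a plain vector addition.
is_additive = np.allclose(clr(posterior), clr(prior) + clr(likelihood))
print(posterior, is_additive)
```

Because clr is invariant to rescaling of its argument, the likelihood function does not need to be normalised before being transformed. With two hypotheses, the clr vector of the likelihood reduces to plus and minus half the LLR, which is the sense in which this representation generalises the binary case.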
More generally, we hope to show that compositional data analysis offers an original and promising perspective on machine learning and statistics.