Breakpoint and atypical region detection based on local score distribution. Applications in diverse domains.


Séminaire Données et Aléatoire Théorie & Applications

27/03/2020 - 14:00 Ms. Sabine Mercier (Université de Toulouse 2 Jean Jaurès) Room 106 - IMAG Building

In biological sequence analysis, a real value called a score is assigned to each component of a sequence. These scores can, for example, reflect a physico-chemical property of the component, chosen according to the context under study. The local score is defined as the maximum cumulative score over all segments of the sequence, at any position and of any length. The goal is to highlight regions of the sequences for which the local score is significantly high.
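As a rough illustration of this definition, the local score of a finite sequence of scores can be computed in a single pass: keep a running cumulative sum that is reset to zero whenever it would become negative, and record its maximum. The sketch below is only an illustration under stated assumptions; the function name `local_score`, the convention that the empty segment has score 0, and the example scores are hypothetical and are not taken from the talk.

```python
def local_score(scores):
    """Return (local score, (start, end)) for a sequence of real scores.

    The local score is the maximum cumulative score over all segments.
    Assumption: the empty segment counts with score 0, so the result is
    never negative, and the segment is None when no score is positive.
    Indices are 0-based and inclusive.
    """
    best = 0            # highest cumulative score seen so far
    best_segment = None
    running = 0         # cumulative score of the current candidate segment
    start = 0           # start index of the current candidate segment
    for k, x in enumerate(scores):
        if running == 0:
            # The previous candidate was abandoned: start a new one here.
            start = k
        # Reset to zero as soon as the cumulative score would go negative.
        running = max(running + x, 0)
        if running > best:
            best = running
            best_segment = (start, k)
    return best, best_segment


# Hypothetical scores, for illustration only.
example = [-1, 2, -3, 4, 1, -2, 3, -5, 1]
print(local_score(example))  # (6, (3, 6)): the segment 4, 1, -2, 3
```

This only locates the highest-scoring segment. Deciding whether its score is significantly high requires the distribution of the local score under a null model, which the sketch does not address.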

In the first part, I will present the historical motivation for the local score. The various theoretical results on its distribution, and the probabilistic tools used to establish them, will be recalled. Some examples of applications in molecular biology and pharmacosurveillance will be presented. The second part will deal with questions surrounding the local score approach. I will discuss learning scoring functions, detail the differences between sliding windows, scan statistics and the local score, and propose possible relations between them. I will also talk about suboptimal segments and multiple testing. Finally, an application to Statistical Process Control (SPC), or signal monitoring, will be presented: a control chart based on the local score will be proposed and its performance evaluated.