Metagenomics for human health: how to extract information from a mixture of noisy data?

Séminaire Probabilités & Statistique

22/11/2018 - 14:00 Mr Clovis Galiez (LJK - UGA) Salle 106 - Bâtiment IMAG

Metagenomics gives access to the genomic information of organisms sampled directly from their natural habitat, for less than a thousand euros per sample. Beyond being affordable, it most importantly circumvents the need to cultivate organisms in wet labs, and over the last decade it has revealed the enormous biodiversity of microbiomes. It has attracted particular interest by unveiling the composition of the human gut, thereby shedding light on an unanticipated link between major diseases (e.g. autism and Parkinson's) and the gut microbiota. Extracting information from metagenomic data comes at a cost: its size and its intrinsic mixture of genes and organisms call for new computational developments in a big data context. We will give a flavor of metagenomic data and its usual processing steps. Through examples of some current limitations, we will show how metagenomic data raises new questions for unsupervised classification. Finally, we will open onto the machine learning challenges that lie ahead in extracting information relevant to human health.