Analysis of heavy rainfall in high dimensions


Probability & Statistics Seminar

7/02/2013 - 14:00 Philippe Naveau (Laboratoire des Sciences du Climat et de l'Environnement (LSCE), CNRS Saclay) Room 1 - Tour IRMA

One of the main objectives of statistical climatology is to extract relevant information hidden in complex spatio-temporal climatological datasets. In impact studies, heavy rainfall is of primary importance for assessing risks linked to floods and other hydrological events. At an hourly time scale, precipitation distributions often differ strongly from Gaussianity. To identify spatial patterns, most well-known statistical techniques, such as the k-means algorithm or PCA, rely on intra- and inter-cluster variances; such approaches, based on deviations from the mean, may not be the most appropriate strategy for studying rainfall extremes. An additional difficulty lies in the dimension of climatological databases of hourly recordings, which may gather measurements from hundreds or even thousands of weather stations over many decades. A possible way to fill this methodological gap is to take advantage of multivariate extreme value theory, a well-developed research field in probability, and to adapt it to the context of spatial clustering. In this talk, we propose and study a two-step algorithm based on this plan. First, we adapt a Partitioning Around Medoids (PAM) clustering algorithm proposed by Kaufman to weekly maxima of hourly precipitation. This provides a set of homogeneous spatial clusters of extremes of reasonable dimension. Second, we refine our analysis by fitting a Bayesian Dirichlet mixture model for multivariate extremes within each cluster.
We compare and discuss our approach through the analysis of hourly precipitation recorded in France (fall season, 92 stations, 1993-2011).
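
The first step of the algorithm can be illustrated with a minimal Python sketch. It is not the implementation used in this work, and it rests on several assumptions: the abstract does not say which dissimilarity is fed to PAM, so the sketch assumes an F-madogram-type distance, 0.5 * E|F_i(M_i) - F_j(M_j)|, estimated from empirical ranks; the PAM routine is a plain random-start swap search; the data are synthetic; and all function names (weekly_maxima, f_madogram, pam) are hypothetical. The Bayesian Dirichlet mixture step is not sketched.

```python
import numpy as np

def weekly_maxima(hourly, hours_per_week=168):
    """Block maxima; hourly has shape (n_hours, n_stations)."""
    n_weeks = hourly.shape[0] // hours_per_week
    trimmed = hourly[: n_weeks * hours_per_week]
    return trimmed.reshape(n_weeks, hours_per_week, -1).max(axis=1)

def f_madogram(maxima):
    """Assumed dissimilarity: 0.5 * mean |F_i(M_i) - F_j(M_j)|, with each
    marginal distribution F_i replaced by normalized empirical ranks."""
    n = maxima.shape[0]
    ranks = maxima.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per station
    u = ranks / (n + 1.0)
    diff = np.abs(u[:, :, None] - u[:, None, :])  # (n_weeks, n_stations, n_stations)
    return 0.5 * diff.mean(axis=0)

def pam(dist, k, max_iter=100, seed=0):
    """Simplified PAM: random initial medoids, then accept any swap
    (medoid <-> non-medoid) that lowers the total distance to the nearest medoid."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    cost = dist[:, medoids].min(axis=1).sum()
    for _ in range(max_iter):
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[pos] = cand
                trial_cost = dist[:, trial].min(axis=1).sum()
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:
            break
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels, cost

# Synthetic stand-in for hourly precipitation: two groups of five stations,
# each group driven by its own shared signal plus station-level noise.
rng = np.random.default_rng(42)
n_hours, n_per_group = 168 * 60, 5
shared = rng.gamma(0.3, 2.0, size=(n_hours, 2))
noise = rng.gamma(0.3, 0.5, size=(n_hours, 2 * n_per_group))
hourly = np.repeat(shared, n_per_group, axis=1) + noise

maxima = weekly_maxima(hourly)           # shape (60, 10)
dist = f_madogram(maxima)                # shape (10, 10)
medoids, labels, cost = pam(dist, k=2)
print("medoids:", medoids, "labels:", labels)
```

Because the dissimilarity depends only on ranks, it is unaffected by the heavy-tailed, non-Gaussian margins of the maxima, and each medoid is an actual station rather than an averaged profile, which is one reason medoid-based clustering suits extremes better than mean-based methods.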

This is joint work with A. Sabourin and E. Bernard.