Computationally fast targeted learning using adaptive survey sampling

Probability & Statistics Seminar

7/11/2019 - 14:00 Mr Antoine Chambaz (MAP5) Room 106 - IMAG Building

Abstract: We address the practical construction of asymptotic confidence intervals (CIs) for smooth, real-valued statistical parameters by targeted learning from i.i.d. data when the sample size is so large that it poses computational challenges. We observe summary measures of all the data and select a sub-sample from the complete data set by sampling with unequal inclusion probabilities based on these summary measures. Targeted learning is then carried out on the sub-sample, which is easier to handle. We derive a central limit theorem for the targeted minimum loss estimator (TMLE), which enables the construction of the CIs. The inclusion probabilities can be optimized to reduce the asymptotic variance of the TMLE. We illustrate the procedure with an example where the parameter of interest is a variable importance measure of a continuous exposure on an outcome. We also conduct a simulation study and comment briefly on its results. This talk is based on joint work with P. Bertail, E. Joly and X. Mary.
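
As a rough illustration of the sub-sampling idea only, the sketch below draws a sub-sample with unequal inclusion probabilities driven by a summary measure, then corrects for the unequal sampling with inverse-probability weights. It rests on several assumptions that do not come from the abstract: Poisson sampling stands in for whatever survey design the talk uses, a Horvitz-Thompson weighted mean stands in for the TMLE, and the data, the function names (poisson_subsample, horvitz_thompson_mean) and the rule "probability proportional to the summary" are all hypothetical.

```python
import random


def poisson_subsample(summaries, expected_size, rng):
    """Keep unit i independently with probability p_i (Poisson sampling).

    Assumption: p_i is proportional to the positive summary measure of
    unit i, scaled so the expected sub-sample size is expected_size.
    """
    total = sum(summaries)
    probs = [min(1.0, expected_size * s / total) for s in summaries]
    kept = [i for i, p in enumerate(probs) if rng.random() < p]
    return kept, probs


def horvitz_thompson_mean(values, kept, probs):
    """Estimate the full-data mean from the sub-sample only.

    Each kept value is weighted by 1 / p_i, which offsets the fact that
    units with large p_i are over-represented in the sub-sample.
    """
    n_full = len(values)
    return sum(values[i] / probs[i] for i in kept) / n_full


rng = random.Random(0)
N = 100_000

# Hypothetical data: a cheap positive summary for every unit, and an
# outcome that is correlated with that summary.
summaries = [rng.uniform(0.5, 2.0) for _ in range(N)]
outcomes = [v + rng.gauss(0.0, 1.0) for v in summaries]

kept, probs = poisson_subsample(summaries, expected_size=5_000, rng=rng)
estimate = horvitz_thompson_mean(outcomes, kept, probs)
full_mean = sum(outcomes) / N

print(f"sub-sample size: {len(kept)} of {N}")
print(f"weighted estimate: {estimate:.3f}, full-data mean: {full_mean:.3f}")
```

In this toy setting the weighted estimate uses about 5% of the data. Changing how probs depends on the summaries changes the variance of the estimate, which is the kind of choice the abstract refers to when it says the inclusion probabilities can be optimized.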