Deep Learning entering the post-protein structure prediction era

Seminar Données et Aléatoire Théorie & Applications

6/01/2022 - 14:00 Sergeï Grudinin (Inria Grenoble - NanoD) Room 106

Last year saw a breakthrough in structural bioinformatics: the long-standing problem of predicting a protein's three-dimensional structure from its sequence was solved by deep learning-based methods, most notably Google DeepMind's AlphaFold2. This success builds on advances transferred from several machine-learning areas, including computer vision and natural language processing. At the same time, the bioinformatics community has developed methods designed specifically for protein sequences, structures, and their representations. Emerging approaches include, among others, (i) geometric learning, i.e., learning on non-regular representations such as graphs, 3D Voronoi tessellations, and point clouds; (ii) pre-trained protein language models leveraging attention; (iii) equivariant architectures that preserve the symmetries of 3D space; (iv) combinations of protein representations; and (v) truly end-to-end architectures, i.e., single differentiable models that take a sequence and return a 3D structure. In this seminar, I will give an overview of this progress, briefly present contributions from our team, and outline current and future challenges in the field.