Structured Models for Action Recognition in Real-world Videos

Speciality: Mathematics and Computer Science

25/10/2012 - 15:00 Mr Adrien Gaidon (Université de Grenoble) Room A103, INRIA Rhône-Alpes, Montbonnot

Keywords:
  • Video Analysis
  • Computer Vision
  • Machine Learning
This dissertation introduces novel models for recognizing broad action categories, such as "opening a door" and "running", in real-world video data such as movies and Internet videos. In particular, we investigate how an action can be decomposed, what its discriminative structure is, and how to use this information to accurately represent video content. The main challenge we address is how to build models of actions that are simultaneously information-rich (in order to correctly differentiate between action categories) and robust to the large variations in actors, actions, and videos present in real-world data. We design three robust models capturing both the content of action parts and the relations between them. Our approach consists in organizing collections of robust local features into structured action representations, for which we propose efficient kernels. Although they share the same underlying principles, our methods differ in the type of problem they address and the structural information they rely on. In all three cases, we conducted thorough experiments on real-world videos from challenging benchmarks used by the action recognition community. We show that our methods outperform the related state of the art, which indicates that using structural information allows for more accurate and robust action recognition in real-world videos.

Directors:

  • Ms Cordelia Schmid (Researcher - INRIA)

Reviewers:

  • Mr Martial Hebert (CMU - USA)

Examiners:

  • Mr Zaïd Harchaoui (Researcher - INRIA)
  • Mr Ivan Laptev (Researcher - INRIA - Paris)