Large-Scale Image Content Analysis and a New Approach to Zero-Shot Learning


Speciality: Computer Science

6/01/2014 - 14:30 Ms Zeynep Akata (Université de Grenoble) Grand Amphi de l'INRIA Rhône-Alpes, Montbonnot

Keywords:
  • Label Embedding
  • Attributes
  • Linear SVMs
  • Stochastic Gradient Descent
  • Zero-Shot Learning
  • Few-Shot Learning
  • Large-Scale Image Classification
Building algorithms that classify images on a large scale is an essential task, because the massive amounts of unlabeled visual data available on the Internet are difficult to search. We aim to classify images based on their content so that such large-scale collections become easier to manage. Large-scale image classification is a difficult problem because datasets are large with respect to both the number of images and the number of classes. Some of these classes are fine-grained, and some may not contain any labeled representatives. In this thesis, we use state-of-the-art image representations and focus on efficient learning methods. Our contributions are (1) a benchmark of learning algorithms for large-scale image classification, and (2) a novel learning algorithm based on label embedding for learning with scarce training data.
First, we propose a benchmark of learning algorithms for large-scale image classification in the fully supervised setting. It compares several objective functions for learning linear classifiers, such as one-vs-rest, multiclass, ranking and weighted average ranking, all optimized with stochastic gradient descent. The output of this benchmark is a set of recommendations for large-scale learning. We show experimentally that online learning is well suited to large-scale image classification. With simple data rebalancing, one-vs-rest performs better than all other methods. Moreover, in online learning, a sufficiently small step size (learning rate) is enough to reach state-of-the-art performance. Finally, regularization through early stopping results in fast training and good generalization performance.
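As an illustration of the one-vs-rest setting described above, here is a minimal sketch in plain Python. It is not the benchmark code of the thesis: the function names, the toy 2-D data, the choice of down-weighting negative samples as the "rebalancing" step, and the fixed epoch budget standing in for early stopping are all assumptions made for the example.

```python
import random


def score(w, b, x):
    """Linear score w.x + b of one binary classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_one_vs_rest(X, y, num_classes, step=0.5, epochs=50,
                      neg_weight=None, seed=0):
    """Train one hinge-loss linear classifier per class with plain SGD.

    neg_weight scales the updates caused by negative samples; this is an
    assumed, simple form of rebalancing (positives keep weight 1).
    The fixed epoch budget plays the role of early stopping here.
    """
    rng = random.Random(seed)
    if neg_weight is None:
        neg_weight = 1.0 / (num_classes - 1)
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(num_classes)]
    b = [0.0] * num_classes
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # online learning: visit samples in random order
        for i in order:
            for c in range(num_classes):
                t = 1.0 if y[i] == c else -1.0
                weight = 1.0 if t > 0 else neg_weight
                if t * score(W[c], b[c], X[i]) < 1.0:
                    # Margin violated: take a hinge-loss subgradient step.
                    for d in range(dim):
                        W[c][d] += step * weight * t * X[i][d]
                    b[c] += step * weight * t
    return W, b


def predict(W, b, x):
    """Return the class whose classifier gives the highest score."""
    return max(range(len(W)), key=lambda c: score(W[c], b[c], x))


# Hypothetical toy data: three well-separated 2-D clusters, five points each.
CENTERS = [(4.0, 0.0), (-2.0, 3.5), (-2.0, -3.5)]
OFFSETS = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
X = [(cx + dx, cy + dy) for cx, cy in CENTERS for dx, dy in OFFSETS]
y = [c for c in range(len(CENTERS)) for _ in OFFSETS]

W, b = train_one_vs_rest(X, y, num_classes=3)
print(sum(predict(W, b, x) == t for x, t in zip(X, y)), "of", len(X), "correct")
```

In a real large-scale setting the epoch budget would normally be chosen by monitoring accuracy on held-out data rather than fixed in advance.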
Second, when dealing with thousands of classes, it is difficult to collect sufficient labeled training data for each class. For some classes we might not have even a single training example. We propose a novel algorithm for this zero-shot learning scenario. Our algorithm uses side information, such as attributes, to embed classes in a Euclidean space. We also introduce a function that measures the compatibility between an image and a label. The parameters of this function are learned using a ranking objective. Our algorithm outperforms the state of the art for zero-shot learning. It is flexible and can accommodate other sources of side information, such as hierarchies. It also allows for a smooth transition from zero-shot to few-shot learning.
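The label-embedding idea can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis implementation: the compatibility function is assumed to be bilinear, F(x, y) = x^T W phi(y), the ranking objective is approximated by a simple pairwise hinge update, and the class names, the two attributes (striped, aquatic) and the 2-D "image features" are hypothetical.

```python
def compat(W, x, phi):
    """Assumed bilinear compatibility F(x, y) = x^T W phi(y)."""
    return sum(x[i] * W[i][j] * phi[j]
               for i in range(len(x)) for j in range(len(phi)))


def train_ranking(samples, seen, step=0.25, epochs=5):
    """Learn W so that the correct class outranks every other seen class.

    For each sample, every seen class that violates a unit margin
    triggers the update W += step * x (phi(y) - phi(other))^T.
    """
    dim_x = len(samples[0][0])
    dim_e = len(next(iter(seen.values())))
    W = [[0.0] * dim_e for _ in range(dim_x)]
    for _ in range(epochs):
        for x, label in samples:
            target = compat(W, x, seen[label])
            violators = [c for c in seen if c != label
                         and 1.0 + compat(W, x, seen[c]) - target > 0.0]
            for c in violators:
                for i in range(dim_x):
                    for j in range(dim_e):
                        W[i][j] += step * x[i] * (seen[label][j] - seen[c][j])
    return W


def predict(W, x, classes):
    """Pick the class whose embedding is most compatible with x."""
    return max(classes, key=lambda c: compat(W, x, classes[c]))


# Hypothetical attribute embeddings: (striped, aquatic), coded as +1 / -1.
SEEN = {"clownfish": (1.0, 1.0), "horse": (-1.0, -1.0), "zebra": (1.0, -1.0)}
UNSEEN = {"dolphin": (-1.0, 1.0)}  # no training image for this class
ALL = {**SEEN, **UNSEEN}

# Hypothetical 2-D image features with labels from seen classes only.
TRAIN = [((1.0, 1.0), "clownfish"), ((-1.0, -1.0), "horse"),
         ((1.0, -1.0), "zebra"), ((0.8, 1.2), "clownfish"),
         ((-1.2, -0.8), "horse"), ((1.1, -0.9), "zebra")]

W = train_ranking(TRAIN, SEEN)
print(predict(W, (-0.9, 1.1), ALL))
```

Because a class enters the model only through its embedding, a new class can be scored as soon as its attribute vector is known. Adding a few labeled images of that class to the training samples is one way to picture the move from zero-shot to few-shot learning.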


Mr Georges Quenot (Research Director - CNRS)


  • Mr Florent Perronnin (Research Director - XRCE)
  • Ms Cordélia Schmid (Research Director - INRIA)


  • Mr Christoph Lampert (Professor - IST Austria, Vienna, Austria)
  • Mr Matthieu Cord (Professor - LIP6-Université SorbonneParis)


  • Mr Georges Quenot (Research Director - CNRS)
  • Mr Vittorio Ferrari (Professor - University of Edinburgh, Edinburgh, United Kingdom)