Advancing Continual Learning Models for Real-World Scenarios

English

Speciality: Image, Vision and Robotics

28/11/2025, 14:00
Zhiqi Kang (Centre Inria de l'Université Grenoble Alpes)
Grand Amphi, Inria Montbonnot

Continual learning (CL) aims to equip artificial intelligence systems with the ability to learn and adapt over time, integrating new information while retaining previously acquired knowledge. This capability is essential for real-world deployment, as real-world environments are ever-changing. However, much of current CL research is conducted under simplified and idealized conditions, such as fully annotated and balanced datasets, clear task boundaries, and extensive training budgets, that fail to reflect the complexity and unpredictability of real-world scenarios. Consequently, models trained under such assumptions often fail to generalize beyond controlled laboratory settings, underscoring the need to study CL under more realistic and practically relevant conditions. This thesis advances continual learning toward such realistic settings to enhance its real-world applicability.

In the first part of the thesis, we address the challenge of reducing human supervision in CL, i.e., continual semi-supervised learning. We propose a learning framework, NNCSL, based on nearest-neighbor techniques that leverage unlabeled data to enhance representation learning while mitigating catastrophic forgetting. The proposed method substantially reduces the need for human supervision while preserving high performance, thereby improving the flexibility and scalability of continual learning models in scenarios where human annotations are difficult, costly, or time-consuming to obtain.

In the second part, we advance continual learning under more realistic data distributions and training conditions. Building on the general continual learning (GCL) paradigm, we introduce MISA, a prompt-based method that exploits pretrained models with parameter-efficient tuning. MISA improves the efficiency and generalizability of prompt parameters while reducing classifier forgetting through a non-parametric logit mask. Extensive experiments demonstrate its robustness under online, blurry, and imbalanced data streams, delivering significant gains over strong baselines.

In the last part of this thesis, we revisit the paradigm of knowledge acquisition by proposing Online In-Context Distillation (ICD), a training-free framework that enables continual adaptation without parameter updates. ICD combines small vision-language models on edge devices with large teacher models in the cloud, using in-context learning and automatic teacher-generated annotations to inject knowledge dynamically. This approach removes the need for human labeling, inherently avoids catastrophic forgetting, and delivers strong performance across multiple vision-language tasks, while crucially satisfying the resource and efficiency constraints of edge environments.

Together, these contributions advance continual learning toward realistic scenarios by reducing supervision, relaxing unrealistic assumptions, and rethinking knowledge acquisition. The results demonstrate that CL can move beyond controlled benchmarks to become more adaptive, efficient, and practically deployable in complex, real-world environments.
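The nearest-neighbor idea behind continual semi-supervised learning can be illustrated with a minimal sketch. This is not NNCSL itself, whose details are in the thesis; it only shows the generic mechanism of pseudo-labeling unlabeled samples by majority vote among their nearest labeled neighbors in feature space. All names and the choice of cosine similarity are illustrative assumptions.

```python
import numpy as np

def nn_pseudo_labels(labeled_feats, labels, unlabeled_feats, k=5):
    """Assign pseudo-labels to unlabeled features by majority vote
    among their k nearest labeled neighbors (cosine similarity).
    Illustrative sketch only, not the NNCSL algorithm."""
    # Normalize rows so dot products equal cosine similarities.
    a = labeled_feats / np.linalg.norm(labeled_feats, axis=1, keepdims=True)
    b = unlabeled_feats / np.linalg.norm(unlabeled_feats, axis=1, keepdims=True)
    sims = b @ a.T                          # (n_unlabeled, n_labeled)
    nn_idx = np.argsort(-sims, axis=1)[:, :k]  # indices of k most similar
    votes = labels[nn_idx]                  # (n_unlabeled, k) neighbor labels
    # Majority vote per unlabeled sample.
    return np.array([np.bincount(v).argmax() for v in votes])
```

In a CL setting, the labeled set would hold the few annotated samples seen so far (and replay-buffer samples), so the vote leverages old knowledge while labeling new data.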
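The non-parametric logit mask mentioned for MISA can also be sketched in its simplest generic form: suppress the logits of classes not yet observed in the stream so that the classifier is neither trained nor evaluated on them, without adding any learnable parameters. This is a hypothetical minimal version, not the thesis's implementation.

```python
import numpy as np

def masked_logits(logits, seen_classes):
    """Suppress logits of classes not yet seen in the data stream.
    Non-parametric: the mask is derived from observed labels only.
    Illustrative sketch, not MISA's actual masking scheme."""
    mask = np.full(logits.shape[-1], -np.inf)
    mask[list(seen_classes)] = 0.0   # keep seen classes unchanged
    return logits + mask             # unseen classes get -inf
```

With a softmax on top, unseen classes receive zero probability, so gradients never push their weights, which is one simple way to limit classifier forgetting and interference.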

Supervisors:

  • Karteek Alahari (Centre Inria de l'Université Grenoble Alpes)

Reviewers:

  • Sarath Chandar (Univ. Montreal)
  • Simone Calderara (Univ. Modena and Reggio Emilia)

Examiners:

  • Frederic Jurie (Univ. Caen)
  • Sophie Achard (CNRS)
  • Rahaf Aljundi (Toyota Motor Europe)