Learning and sampling under requirements

English

Séminaire Données et Aléatoire Théorie & Applications

3/04/2025 - 14:00, Prof. Luiz Chamon, Auditorium IMAG

Requirements are integral to system engineering and of growing interest in machine learning (ML) as it increasingly faces the challenges of designing AI systems rather than just components. Today, however, ML does not organically incorporate requirements, such as robustness, fairness, or compliance with prior knowledge, leading to brittle, biased, and inconsistent solutions. Instead, these statistical requirements are induced by aggregating violation metrics into the training objective. To be effective, this approach requires careful tuning of hyperparameters (penalty coefficients) and cross-validation, a computationally intensive and time-consuming process. Constrained learning, in contrast, incorporates requirements as statistical constraints rather than modifying the training objective. In this talk, I will show when and how it is possible to learn under constraints and effectively impose requirements on AI systems during training and at test time. I will introduce new non-convex duality results that yield generalization guarantees for constrained learning, showing that, despite appearances, it is not harder than unconstrained learning. In fact, both have essentially the same sample complexity. I will then use these results to derive practical algorithms to tackle these problems, despite their non-convexity. Finally, I will comment on how similar results can be obtained for Markov chain Monte Carlo (MCMC) methods to enable sampling under statistical constraints. In contrast to traditional problems involving support constraints, techniques such as mirror maps, barriers, and penalties are not suited to this task. I will illustrate how these advances directly enable the design of trustworthy AI systems for, e.g., scientific applications. Ultimately, these contributions suggest how we can go beyond the current objective-centric learning paradigm towards a constraint-driven one.
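To make the contrast above concrete, here is a minimal sketch of the two formulations, in illustrative notation not taken from the talk: a parameterized model f_θ, a training loss ℓ_0, requirement violation metrics ℓ_1, …, ℓ_m, and tolerances c_i.

% Penalty-based approach: requirements aggregated into the training objective
% via penalty coefficients lambda_i that must be tuned and cross-validated.
\text{penalty-based:} \quad \min_{\theta} \; \mathbb{E}\big[\ell_0(f_\theta(x), y)\big] + \sum_{i=1}^{m} \lambda_i \, \mathbb{E}\big[\ell_i(f_\theta(x), y)\big]

% Constrained learning: requirements imposed as explicit statistical constraints.
\text{constrained:} \quad \min_{\theta} \; \mathbb{E}\big[\ell_0(f_\theta(x), y)\big] \quad \text{s.t.} \quad \mathbb{E}\big[\ell_i(f_\theta(x), y)\big] \le c_i, \quad i = 1, \dots, m

In the penalty-based formulation, the coefficients \lambda_i are the hyperparameters to be tuned; in the constrained one they reappear as dual variables of the associated Lagrangian, \max_{\lambda \ge 0} \min_{\theta} \, \mathbb{E}[\ell_0] + \sum_{i} \lambda_i \big( \mathbb{E}[\ell_i] - c_i \big), which is presumably the object of the non-convex duality results mentioned above.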