Statistical Learning Theory
| Main contact(s) | Michel Barret | | |
|---|---|---|---|
| UE | SM11 | Credits | 2 ECTS |
| Lectures | 10.5 hr | Tutorials | 10.5 hr |
| Labworks | 0 hr | Exam | 2 hr |
Presentation
The objective of supervised learning is to design methods that, given a training set of examples, make the best possible decision on average about an unknown quantity from observations; for example, classifying images according to their content, i.e. deciding whether an image represents a cat, a dog, or something else. We will formally state the problem and study the generalization guarantees of supervised learning algorithms, i.e. the quality of the predicted output for an input not present in the training set. To this end, we will introduce the notions of a PAC (probably approximately correct) learnable hypothesis space and of the Vapnik-Chervonenkis dimension of a hypothesis space. Finally, time permitting, we will present Olivier Catoni's point of view on PAC-Bayesian bounds for the deviations between the empirical risk and the true risk.
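To fix notation (ours, as a minimal sketch; the course's conventions may differ), the standard formalization compares the true risk of a hypothesis $h$ under the data distribution $\mathcal{D}$ with its empirical risk on an $m$-sample $S = ((x_1,y_1),\dots,(x_m,y_m))$:

$$
L_{\mathcal{D}}(h) = \mathbb{P}_{(x,y)\sim\mathcal{D}}\bigl[h(x)\neq y\bigr],
\qquad
L_S(h) = \frac{1}{m}\sum_{i=1}^{m}\mathbb{1}\bigl[h(x_i)\neq y_i\bigr].
$$

A hypothesis space $\mathcal{H}$ is (agnostically) PAC learnable if there exist an algorithm $A$ and a sample-complexity function $m_{\mathcal{H}}(\varepsilon,\delta)$ such that, for every $\varepsilon,\delta\in(0,1)$ and every distribution $\mathcal{D}$, training on $m \ge m_{\mathcal{H}}(\varepsilon,\delta)$ i.i.d. examples returns $h_S = A(S)$ satisfying

$$
\mathbb{P}\Bigl[L_{\mathcal{D}}(h_S)\le \min_{h\in\mathcal{H}} L_{\mathcal{D}}(h)+\varepsilon\Bigr]\ge 1-\delta.
$$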
Learning outcomes
At the end of this course, students will be able to:
- understand elements of the theory of supervised learning;
- understand the bias-complexity trade-off of a hypothesis class;
- understand and use PAC-Bayesian bounds in supervised learning, in particular for binary classification problems (a classical bound is sketched below).
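For orientation, here is one classical PAC-Bayesian bound (a McAllester-type bound in Maurer's form; the course may instead develop Catoni's tighter variants). For a prior $P$ over $\mathcal{H}$ chosen before seeing the data, with probability at least $1-\delta$ over the draw of the $m$-sample $S$, simultaneously for every posterior $Q$ over $\mathcal{H}$,

$$
\mathbb{E}_{h\sim Q}\bigl[L_{\mathcal{D}}(h)\bigr] \;\le\; \mathbb{E}_{h\sim Q}\bigl[L_S(h)\bigr] + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P)+\ln\frac{2\sqrt{m}}{\delta}}{2m}}.
$$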
Syllabus
Formalization of supervised learning problems
- PAC learnability and uniform convergence
- The bias-complexity trade-off
- The VC (Vapnik-Chervonenkis) dimension of a hypothesis space
- Two fundamental theorems of PAC learning (see the sketch after this list)
- PAC-Bayes learning bounds
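As a rough quantitative guide to the VC-dimension items above (constants omitted; this is a sketch, not the course's exact statement), the fundamental theorem of statistical learning says that for a class $\mathcal{H}$ of finite VC dimension $d$, with probability at least $1-\delta$ over an i.i.d. $m$-sample $S$, uniformly over all $h\in\mathcal{H}$,

$$
\bigl|L_{\mathcal{D}}(h)-L_S(h)\bigr| \;\le\; C\,\sqrt{\frac{d+\ln(1/\delta)}{m}}
$$

for a universal constant $C$; in particular, finite VC dimension is equivalent to (agnostic) PAC learnability.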