AppStat

Statistical learning

Description: The objective of supervised learning is to propose methods that, given a training set of examples, make a decision about a parameter based on observations, the decision being the best possible on average. An example is classifying images according to their content, i.e. deciding whether an image represents a cat, a dog, or something else. We will formally present the problem and study the generalization guarantees of supervised learning algorithms, i.e. the quality of the prediction of the output associated with an input not present in the training set. To this end, we will introduce the concepts of PAC (probably approximately correct) learnability of a hypothesis space and of the Vapnik-Chervonenkis dimension of a hypothesis space. We will state and prove two fundamental theorems of supervised learning theory, giving a lower bound and an upper bound on the true risk for the binary classification problem.
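
As a rough illustration of the quantities mentioned above, the block below writes out the true risk, the empirical risk, and the usual shape of a VC-type generalization bound for binary classification. The notation (distribution D, sample S of size m, hypothesis class H of VC dimension d, confidence parameter delta, absolute constant C) is assumed here for illustration and is not taken from the course material; the exact statements and constants proved in the course may differ.

```latex
% Assumed notation: D is the data distribution, S = ((x_1,y_1),...,(x_m,y_m))
% is an i.i.d. training sample, and h is a binary classifier in the class H.

% True (real) risk: probability of misclassifying a fresh example.
L_D(h) = \Pr_{(x,y)\sim D}\bigl[\, h(x) \neq y \,\bigr]

% Empirical risk: fraction of training examples misclassified.
L_S(h) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\bigl[\, h(x_i) \neq y_i \,\bigr]

% Typical form of a uniform convergence bound when H has finite VC dimension d:
% with probability at least 1 - \delta over the draw of S,
\sup_{h \in H} \bigl| L_D(h) - L_S(h) \bigr|
  \;\le\; C \sqrt{\frac{d + \ln(1/\delta)}{m}}
```

Read this way, the bound makes the bias-complexity trade-off visible: a richer class H (larger d) can lower the best achievable risk but widens the gap between empirical and true risk for a fixed sample size m.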

Content: Formalization of supervised learning problems; PAC learnability and uniform convergence; the bias-complexity trade-off; the VC (Vapnik-Chervonenkis) dimension of a hypothesis space; two fundamental theorems of PAC learning.

Learning outcomes: At the end of this course, students will be able: -to understand the elements of the theory of supervised learning; -to understand the bias-complexity trade-off of a hypothesis class; -to understand and use PAC-Bayesian bounds of supervised learning (in particular those for the binary classification problem).

Teaching methods: 10.5 h of lectures + 10.5 h of tutorials + 2 h written exam

Means: The tutorials (TDs) consist of exercises in which students apply the concepts seen in the lectures.

Evaluation methods: 2 h written exam, documents allowed

Evaluated skills:

Analyze, design, and build complex systems with scientific, technological, human, and economic components

Course supervisor: Michel Barret

Geode ID: 3MD4140