A Primer on PAC-Bayesian Learning

Long Beach, CA, USA - June 10, 2019


PAC-Bayesian inequalities were introduced by McAllester (1998, 1999), following earlier remarks by Shawe-Taylor and Williamson (1997). The goal was to produce PAC-type risk bounds for Bayesian-flavored estimators. The acronym PAC stands for Probably Approximately Correct and may be traced back to Valiant (1984). The framework covers not only classical Bayesian estimators but any randomized procedure whose predictor is drawn from a data-dependent distribution.
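To make "PAC-type risk bound" concrete, here is one commonly quoted McAllester-style inequality, given as an illustrative sketch rather than as the exact statement used in the tutorial. The notation is assumed and does not come from the text above: $\pi$ is a prior over hypotheses chosen independently of the data, $\rho$ is any (possibly data-dependent) posterior, $S$ is an i.i.d. sample of size $n$, the loss takes values in $[0,1]$, $R$ is the risk and $r_S$ the empirical risk.

```latex
% Assumed notation (not from the original text):
%   \pi    : prior over hypotheses, fixed before seeing the sample
%   \rho   : any posterior over hypotheses, may depend on the sample S
%   R(h)   : expected loss of h;  r_S(h) : empirical loss of h on S (loss in [0,1])
%   n      : sample size (i.i.d.);  \delta \in (0,1) : confidence parameter
\[
\mathbb{P}_{S}\!\left(
  \forall \rho:\;
  \mathbb{E}_{h\sim\rho}\,[R(h)]
  \;\le\;
  \mathbb{E}_{h\sim\rho}\,[r_S(h)]
  + \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\right) \;\ge\; 1-\delta .
\]
```

The "probably" is the $1-\delta$ confidence level, the "approximately" is the square-root complexity term, and the Bayesian flavor comes from the Kullback-Leibler divergence between the posterior $\rho$ and the prior $\pi$. The bound holds simultaneously for all posteriors, which is why $\rho$ need not be a classical Bayesian posterior.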

Over the past few years, the PAC-Bayesian approach has been applied to numerous settings, including classification, high-dimensional sparse regression, image denoising and reconstruction of large random matrices, recommendation systems and collaborative filtering, binary ranking, online ranking, transfer learning, multiview learning, signal processing, and physics, to name but a few. The "PAC-Bayes" query on arXiv illustrates how PAC-Bayes is quickly re-emerging as a principled theory to efficiently address modern machine learning topics, such as learning with heavy-tailed and dependent data, or the generalisation abilities of deep neural networks.


The tutorial aims to provide the ICML audience with a comprehensive overview of PAC-Bayes, starting from statistical learning theory (complexity terms analysis, generalisation and oracle bounds), covering algorithmic developments (actual implementation of PAC-Bayesian algorithms), and going up to the most recent PAC-Bayesian analyses of the generalisation abilities of deep neural networks. The PAC-Bayesian framework is the backbone of several influential contributions to statistical learning theory and deep learning, and we believe it is time to revisit this theory in a full tutorial. We aim to reach the widest possible audience: an elementary background in probability theory and statistical learning is helpful, although all key concepts will be covered from scratch.


Benjamin Guedj is a tenured research scientist at Inria (France) and a senior research scientist at University College London (UK). His main research areas are statistical learning theory, PAC-Bayes, machine learning and computational statistics. Benjamin Guedj obtained a PhD from Université Pierre et Marie Curie (France) in 2013.

John Shawe-Taylor is a professor at University College London (UK), where he is Director of the Centre for Computational Statistics and Machine Learning (CSML). His main research area is statistical learning theory, with contributions to neural networks, machine learning, and graph theory. John Shawe-Taylor obtained a PhD in Mathematics at Royal Holloway, University of London in 1986. He has published over 150 research papers, and his pioneering work initiated PAC-Bayesian theory, to which he has made many later contributions. He has coordinated a number of Europe-wide projects investigating the theory and practice of machine learning.


Keywords: Statistical learning theory, PAC-Bayes, machine learning, computational statistics

Slides are available here.

Videos are available here: Part 1 Part 2