A PAC-Bayesian Analysis of Dropouts

Oct 6, 2014 (2111 views)

Intuitively, a neural network that is robust to dropout perturbations should generalize better: it should perform better on novel inputs. Stochastic model perturbation is the fundamental concept underlying PAC-Bayesian generalization theory. This talk will briefly summarize PAC-Bayesian generalization theory and give a regularization bound for a simple form of dropout training as a straightforward application. For a regularization bound involving an L2 penalty on the model weights, dropouts reduce the regularization penalty by a factor of 1-alpha, where alpha is the dropout rate. The bound then expresses a trade-off between the dropout rate and the training loss. While this regularization bound is intriguing, it may not be the right analysis. An alternative analysis involves variance reduction, the standard motivation for bagging. There are good reasons to believe that a certain general PAC-Bayes variance bound is significantly tighter than the general PAC-Bayes regularization bound. Unfortunately, the variance bound is opaque: it does not involve explicit regularization and is difficult to compare with regularization bounds. Also, unlike regularization bounds, the variance bound offers no obvious method for designing algorithms that minimize it. A compelling variance-based PAC-Bayesian analysis of dropouts remains an open problem.
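
The 1-alpha factor can be illustrated numerically. The following is a minimal sketch, not the construction used in the talk. It assumes a simple form of dropout in which each weight is independently zeroed with probability alpha and unit-variance Gaussian noise is added to the result. Under that assumption the KL penalty for a fixed mask is half the squared norm of the surviving weights, and averaging over masks scales the full L2 penalty by 1-alpha. The weight vector `theta` and the function names are hypothetical and chosen only for illustration.

```python
import random


def expected_masked_l2_penalty(theta, alpha, num_samples=20000, seed=0):
    """Monte Carlo estimate of E_s[0.5 * ||s * theta||^2].

    Assumption (not from the talk): each mask entry s_i is 0 with
    probability alpha and 1 otherwise, independently per weight.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # Keep a weight with probability 1 - alpha, then sum the squares
        # of the weights that survived this mask.
        total += 0.5 * sum(t * t for t in theta if rng.random() >= alpha)
    return total / num_samples


def closed_form_penalty(theta, alpha):
    """Closed form of the same expectation: 0.5 * (1 - alpha) * ||theta||^2."""
    return 0.5 * (1.0 - alpha) * sum(t * t for t in theta)


theta = [0.5, -1.2, 2.0, 0.3]  # hypothetical weight vector
alpha = 0.5                    # dropout rate

estimate = expected_masked_l2_penalty(theta, alpha)
exact = closed_form_penalty(theta, alpha)
full = closed_form_penalty(theta, 0.0)  # penalty without dropout

print(f"Monte Carlo estimate: {estimate:.3f}")
print(f"(1 - alpha) * full penalty: {exact:.3f}")
print(f"full L2 penalty: {full:.3f}")
```

Raising alpha shrinks the penalty term but perturbs the model more heavily, which typically raises the training loss. That is the trade-off the abstract describes.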

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.