We need a BIT more GUTS (Grand Unified Theory of Statistics)
A remarkable variety of problems in machine learning and statistics can be recast as data compression under constraints: (1) Sequential prediction with arbitrary loss functions can be transformed into equivalent log-loss (data compression) problems. The worst-case optimal regret for the original loss is determined by Vovk’s mixability, which in fact measures how many bits we lose if we are not allowed to use mixture codes in the compression formulation. (2) In classification, we can map each set of candidate classifiers C to a corresponding probability model M. Tsybakov’s condition (which determines the optimal convergence rate) turns out to measure how much more we can compress data by coding it using the convex hull of M rather than just M. (3) Hypothesis testing in the applied sciences is usually based on p-values, a brittle and much-criticized approach. Berger and Vovk independently proposed calibrated p-values, which are much more robust. Again, we show that these have a data compression interpretation. (4) Bayesian nonparametric approaches usually work well, but fail dramatically in Diaconis and Freedman’s pathological cases. We show that in these cases (and only in these) the Bayesian predictive distribution does not compress the data. We speculate that all this points towards a general theory that goes beyond standard MDL and Bayes.
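To make the compression reading of item (1) concrete, here is a minimal sketch, not taken from the talk, of the standard identification of log loss with code length: predicting outcome y with probability p costs -log2 p bits. It compares the code length of a uniform Bayesian mixture over a few fixed Bernoulli predictors with that of the best single predictor in hindsight. The function names, the parameter grid `thetas`, and the toy sequence `outcomes` are all hypothetical choices made for illustration. The sketch only demonstrates the well-known bound that a uniform mixture code over N predictors pays at most log2 N extra bits; it does not implement mixability for general loss functions.

```python
import math

def bernoulli_bits(theta, outcomes):
    """Code length in bits (cumulative log loss) of a fixed Bernoulli(theta) predictor."""
    return sum(-math.log2(theta if y == 1 else 1.0 - theta) for y in outcomes)

def mixture_bits(thetas, outcomes):
    """Code length in bits of the uniform Bayesian mixture over Bernoulli predictors."""
    weights = [1.0 / len(thetas)] * len(thetas)  # uniform prior (an assumption)
    total = 0.0
    for y in outcomes:
        # Probability each predictor assigns to the observed outcome.
        likes = [t if y == 1 else 1.0 - t for t in thetas]
        # Mixture predictive probability of the observed outcome.
        p_mix = sum(w * l for w, l in zip(weights, likes))
        total += -math.log2(p_mix)
        # Bayesian posterior update of the weights.
        weights = [w * l / p_mix for w, l in zip(weights, likes)]
    return total

# Hypothetical toy data, chosen only for illustration.
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
outcomes = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

best = min(bernoulli_bits(t, outcomes) for t in thetas)
mix = mixture_bits(thetas, outcomes)
regret = mix - best  # extra bits paid by the mixture code

print(f"best single predictor: {best:.3f} bits")
print(f"uniform mixture code:  {mix:.3f} bits")
print(f"regret: {regret:.3f} bits (bound: {math.log2(len(thetas)):.3f})")
```

The regret stays between 0 and log2 N because the mixture probability of the whole sequence is the average of the individual predictors' probabilities, which is at most the largest of them and at least 1/N times the largest.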