Bayesian Interpretations of RKHS Embedding Methods
We give a simple interpretation of mean embeddings as expectations under a Gaussian process prior. Methods such as kernel two-sample tests, the Hilbert-Schmidt Independence Criterion, and kernel herding are all based on distances between mean embeddings, a quantity known as the Maximum Mean Discrepancy (MMD). This Bayesian interpretation allows a derivation of optimal herding weights and principled methods of kernel learning, and sheds light on the assumptions necessary for MMD-based methods to work in practice. In the other direction, the MMD interpretation gives tight, closed-form bounds on the error of Bayesian estimators.
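
To make the quantity concrete, the following is a minimal sketch of the squared distance between two empirical mean embeddings, i.e. a biased (V-statistic) estimate of the squared MMD. The squared-exponential kernel, the scalar inputs, and the names `rbf_kernel`, `mmd_squared`, and `lengthscale` are illustrative assumptions, not taken from the paper.

```python
import math


def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential kernel on scalars (an assumed, illustrative choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def mmd_squared(xs, ys, kernel=rbf_kernel):
    """Biased estimate of squared MMD between samples xs and ys.

    Expands ||mu_x - mu_y||^2 in the RKHS, where mu_x and mu_y are the
    empirical mean embeddings, into three averages of kernel evaluations.
    """
    def mean_kernel(a, b):
        # Average of k(u, v) over all pairs (u, v) with u in a and v in b.
        return sum(kernel(u, v) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) - 2.0 * mean_kernel(xs, ys) + mean_kernel(ys, ys)


if __name__ == "__main__":
    sample_p = [0.0, 0.1, -0.2, 0.3]
    sample_q = [2.0, 2.1, 1.8, 2.4]
    print(mmd_squared(sample_p, sample_p))  # identical samples
    print(mmd_squared(sample_p, sample_q))  # well-separated samples
```

Identical samples give an estimate of zero, and the estimate grows as the samples move apart relative to the kernel lengthscale. The double loop costs one kernel evaluation per pair of points, which is adequate for a small illustration.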