Menu

Binary Embedding: Fundamental Limits and Fast Algorithm

calendar icon Dec 5, 2015 1740 views
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

Binary embedding is a nonlinear dimension reduction methodology where high dimensional data are embedded into the Hamming cube while preserving the structure of the original space. Specifically, for an arbitrary N distinct points in Sp−1, our goal is to encode each point using m-dimensional binary strings such that we can reconstruct their geodesic distance up to δ uniform distortion. Existing binary embedding algorithms either lack theoretical guarantees or suffer from running time O(mp). We make three contributions: (1) we establish a lower bound that shows any binary embedding oblivious to the set of points requires m=Ω(1δ2logN) bits and a similar lower bound for non-oblivious embeddings into Hamming distance; (2) we propose a novel fast binary embedding algorithm with provably optimal bit complexity m=O(1δ2logN) and near linear running time O(plogp) whenever logN≪δp√, with a slightly worse running time for larger logN; (3) we also provide an analytic result about embedding a general set of points K⊆Sp−1 with even infinite size. Our theoretical findings are supported through experiments on both synthetic and real data sets.

RELATED CATEGORIES

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.