Menu

Fast rates for the multi-armed bandit

calendar icon Oct 6, 2014 2274 views
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

Since the seminal work of Lai and Robbins (1985) we know bandit strategies with normalized regret of order (i) 1/sqrt(T) for any stochastic bandit, and (ii) log(T) / T for 'benign' distributions. In Bubeck and Slivkins (2012) we designed a new strategy which extends property (i) to adversarial bandits while still having the fast rate given in (ii). I will present this algorithm and I will also discuss the possibility of even faster rates of order 1/T when extra information is available.

RELATED CATEGORIES

MORE VIDEOS FROM THE EVENT

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.