First-order regret bounds for combinatorial semi-bandits

We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total

Gergely Neu
