Menu

Tradeoffs in online learning under partial information feedback

calendar icon Jan 16, 2013 2866 views
split view icon
video icon
presentation icon
video with chapters icon
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

How should an online learner choose its actions to trade off between exploration and exploitation to maximize the accuracy of predictions where the choice of actions directly influence what information the learner receives? First, using the abstract framework of partial monitoring, we provide a full answer to this question for any discrete prediction problems: As it turns out, the difficulty at the optimal tradeoff depends on a novel, yet intuitive geometric-algebraic condition. We also discuss tradeoffs and open problems concerning adaptation to benign environments, predictions with side-information, a specific problem when the learner needs to pay for accessing the feature values and the label, and the influence of delays in receiving the feedback.

RELATED CATEGORIES

MORE VIDEOS FROM THE EVENT

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.