We consider online learning in Markov decision processes with adversarial reward functions. Depending on the information available to the decision maker, we analyze two scenarios: in one setup the