While optimal control and reinforcement learning are fundamental frameworks for learning and control applications, their application to high-dimensional control systems of the complexity of humanoid a