We consider the problem of multi-task learning, i.e. the setup where there are multiple related prediction problems (tasks), and we seek to improve predictive performance by sharing information across