On-line Learning of Wide-domain Spoken Dialogue Systems

The first part of the talk reviews the general structure of limited domain statistical SDS and then explains how a collection of limited domain systems can be merged using the framework of Bayesian Committee Machines. The problem of reward estimation in on-line learning is then introduced and a solution based on the joint estimation of Gaussian Process based reward prediction and dialogue policy is presented.