Menu

Dynamic Bayesian Networks for Multimodal Interaction

calendar icon Feb 25, 2007 10537 views
split view icon
video icon
presentation icon
video with chapters icon
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

Dynamic Bayesian networks (DBNs) offer a natural upgrade path beyond classical hidden Markov models and become especially relevant when temporal data contains higher order structure, multiple modalities or multi-person interaction. We describe several instantiations of dynamic Bayesian networks that are useful for modeling temporal phenomena spanning audio, video and haptic channels in single, two-person and multi-person activity. These models include input-output hidden Markov models, switched Kalman filters and, most generally, dynamical systems trees (DSTs). These models are used to learn audio-video interaction in social activities, video interaction in multi-person game playing and haptic-video interaction in robotic laparoscopy. Model parameters are estimated from data in an unsupervised setting using generalized expectation maximization methods. Subsequently, these models can predict, synthesize and classify various types of rich multimodal human activity. Experiments in gesture interaction, audio-video conversation, football game playing and surgical drill evaluation are shown.

RELATED CATEGORIES

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.