People In Motion: Pose, Action and Communication

This talk will give an overview of research in the Image and Video Computing Group at Boston University on tracking, analysis, recognition, and retrieval of images and video based on humans and their actions. First, efficient methods for inferring human pose will be presented. Linearly augmented tree models are proposed that enable efficient scale- and rotation-invariant matching. In another approach, articulated pose estimation with loopy graph models is made efficient via a branch-and-bound strategy for finding the globally optimal pose. Second, methods for learning human action models from Web images and video will be presented; these methods require no human intervention other than supplying the action keywords used to form text queries to Web image and video search engines. A Multiple Instance Learning framework for exploiting properties of the scene, objects, and humans in video is also proposed. Third, work towards automatic recognition and retrieval of American Sign Language (ASL) in video databases will be presented. The goal is to enable users to search ASL video content simply by video-recording a query sign and relying on computer-based sign recognition for lookup.

Stan Sclaroff
