Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities
published: July 19, 2010, recorded: June 2010, views: 5320
Detecting objects in cluttered scenes and estimating articulated human body parts are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g. playing tennis), where the relevant object tends to be small or only partially visible, and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context for each other: recognizing one facilitates the recognition of the other. In this paper we propose a new random field model to encode the mutual context of objects and human poses in human-object interaction activities. We then cast the model learning task as a structure learning problem, in which the structural connectivity between the object, the overall human pose, and the different body parts is estimated through a structure search approach, and the parameters of the model are estimated by a new max-margin algorithm. On a sports data set of six classes of human-object interactions, we show that our mutual context model significantly outperforms the state of the art in detecting very difficult objects and human poses.
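The mutual-context idea in the abstract can be illustrated with a toy example. The sketch below is not the random field model from the paper: it scores each (object, pose) pair as the sum of two unary scores and one pairwise compatibility term, then picks the best pair by exhaustive search. All names, candidate labels, and numbers (`joint_map`, `racket_at_A`, the score values) are hypothetical and chosen only for illustration.

```python
import itertools


def joint_map(object_scores, pose_scores, compatibility):
    """Return the (object, pose) pair with the highest joint score.

    Joint score = unary object score + unary pose score
                  + pairwise compatibility (0.0 if the pair is unlisted).
    """
    best_pair, best_score = None, float("-inf")
    for obj, pose in itertools.product(object_scores, pose_scores):
        score = (object_scores[obj]
                 + pose_scores[pose]
                 + compatibility.get((obj, pose), 0.0))
        if score > best_score:
            best_pair, best_score = (obj, pose), score
    return best_pair, best_score


# Hypothetical detector outputs: the object detector alone slightly
# prefers location B, the pose estimator alone prefers "forehand".
object_scores = {"racket_at_A": 0.2, "racket_at_B": 0.3}
pose_scores = {"forehand": 0.6, "serve": 0.5}

# Hypothetical pairwise term: a racket at A fits a forehand pose well.
compatibility = {
    ("racket_at_A", "forehand"): 0.5,
    ("racket_at_B", "serve"): 0.1,
}

# Independent decisions ignore the pairwise term.
independent = (
    max(object_scores, key=object_scores.get),
    max(pose_scores, key=pose_scores.get),
)

# Joint decision lets the pose act as context for the object.
joint, joint_score = joint_map(object_scores, pose_scores, compatibility)

print("independent:", independent)
print("joint:", joint, round(joint_score, 3))
```

With these made-up numbers, the independent choice places the racket at B, while the joint choice moves it to A because that location agrees with the preferred pose. The model described in the abstract additionally involves individual body parts, a learned connectivity structure, and max-margin parameter estimation, none of which this sketch attempts.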
Download slides: cvpr2010_fei_fei_mmco_01.pdf (1.7 MB)
Download slides: cvpr2010_fei_fei_mmco_01.v1.pdf (2.5 MB)
Download slides: cvpr2010_fei_fei_mmco_01.ppt (13.3 MB)