Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions
published: July 19, 2010, recorded: June 2010, views: 2098
Psychologists have proposed that many human-object interaction activities form unique classes of scenes. Recognizing these scenes is important for many social functions. Enabling a computer to do this, however, is a challenging task. Take people-playing-musical-instrument (PPMI) as an example: distinguishing a person playing a violin from a person just holding a violin requires subtle distinctions among the characteristic image features and feature arrangements that differentiate these two scenes. Most existing image representation methods are either too coarse (e.g. bag-of-words, BoW) or too sparse (e.g. constellation models) for this task. In this paper, we propose a new image feature representation called the “grouplet”. The grouplet captures the structured information of an image by encoding a number of discriminative visual features and their spatial configurations. Using a dataset of 7 different PPMI activities, we show that grouplets are more effective in classifying and detecting human-object interactions than other state-of-the-art methods. In particular, our method can robustly distinguish humans playing the instruments from humans who merely co-occur with the instruments without playing them.
Download slides: cvpr2010_yao_gsir_01.v1.pdf (3.0 MB)
Download slides: cvpr2010_yao_gsir_01.ppt (6.3 MB)
Download article: cvpr2010_yao_gsir_01.pdf (1012.4 KB)