Towards Semantic Embedding in Visual Vocabulary
published: July 19, 2010, recorded: June 2010
A visual vocabulary is a fundamental component in many computer vision tasks, such as object recognition, visual search, and scene modeling. While state-of-the-art approaches build the visual vocabulary solely from visual statistics of local image patches, the associated image labels are left unexploited when generating visual words. In this work, we present a semantic embedding framework that integrates semantic information from Flickr labels for supervised vocabulary construction. Our main contribution is a Hidden Markov Random Field model that supervises feature space quantization, with special consideration of label correlations: local visual features are modeled as an Observed Field, which follows visual metrics to partition the feature space. Semantic labels are modeled as a Hidden Field, which imposes generative supervision on the Observed Field, with WordNet-based correlation constraints expressed as a Gibbs distribution. By simplifying the Markov property in the Hidden Field, both unsupervised and supervised (label-independent) vocabularies can be derived from our framework. We validate our approach on two challenging computer vision tasks, with comparisons to state-of-the-art methods: (1) large-scale image search on a Flickr 60,000 database; (2) object recognition on the PASCAL VOC database.
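To make the idea of label-supervised quantization concrete, the following is a minimal sketch and not the Hidden Markov Random Field inference described in the abstract. It swaps in a much simpler technique: k-means in which each descriptor's assignment cost combines a visual distance with a penalty for joining a visual word whose label distribution correlates poorly with the descriptor's own label. The function name `build_vocabulary`, the weight `lam`, and the `label_sim` matrix (standing in for WordNet-based label correlations) are all assumptions introduced for illustration.

```python
import numpy as np

def build_vocabulary(features, labels, label_sim, k, lam=0.5, n_iter=10, seed=0):
    """Label-penalized k-means (illustrative stand-in, not the paper's HMRF).

    features  : (n, d) array of local descriptors
    labels    : (n,) int array, one label index per descriptor
    label_sim : (L, L) array in [0, 1], assumed label correlation matrix
    k         : number of visual words
    lam       : weight of the semantic penalty (lam=0 gives plain k-means)
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=int)
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    n_labels = label_sim.shape[0]

    # Initialise visual words from k distinct randomly chosen descriptors.
    centers = features[rng.choice(n, size=k, replace=False)].copy()
    assign = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Squared Euclidean distance of every descriptor to every word: (n, k).
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

        # Purely visual assignment, used to estimate each word's label mix.
        visual_assign = d2.argmin(axis=1)
        hist = np.zeros((k, n_labels))
        np.add.at(hist, (visual_assign, labels), 1.0)
        hist /= np.maximum(hist.sum(axis=1, keepdims=True), 1.0)

        # Affinity between each descriptor's label and each word's labels: (n, k).
        affinity = label_sim[labels] @ hist.T

        # Combined cost: normalised visual distance plus semantic penalty.
        cost = d2 / max(d2.mean(), 1e-12) + lam * (1.0 - affinity)
        assign = cost.argmin(axis=1)

        # Move each word to the mean of its descriptors; keep empty words as-is.
        for j in range(k):
            mask = assign == j
            if mask.any():
                centers[j] = features[mask].mean(axis=0)

    # assign is the assignment that produced the final centre update.
    return centers, assign
```

Setting `lam=0` removes the semantic term, so the sketch reduces to ordinary unsupervised k-means quantization. This loosely mirrors the abstract's remark that an unsupervised vocabulary is a special case of the supervised framework, although the mechanism here is far simpler than a Gibbs-distributed Hidden Field.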
Download slides: cvpr2010_ji_tsev_01.v1.pdf (3.3 MB)
Download slides: cvpr2010_ji_tsev_01.ppt (7.4 MB)
Download article: cvpr2010_ji_tsev_01.pdf (2.8 MB)