published: Feb. 25, 2007, recorded: May 2004, views: 4608
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
A simple and efficient way to model much image and video data is to decompose it into a set of 2-dimensional objects in layers. Each object is characterized by its shape and appearance (as with a "sprite" in computer graphics). Following earlier work on layer decompositions in computer vision (e.g. Wang and Adelson, 1994), Frey and Jojic (1999) stated the sprite-learning problem in terms of transformation-invariant clustering using mixture models and EM. This was later extended (Jojic and Frey, 2001) to learning multiple sprites/objects from a video sequence. The approach of building in knowledge about allowable transformations into the clustering algorithm is an important way that a machine learning algorithm (clustering) needs to be tailored to the computer vision domain. Frey and Jojic's approach to learning multiple sprites uses variational inference simultaneously on all sprites; we also discuss recent work by Williams and Titsias (2004) who describe a greedy sequential algorithm for this task.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !