Training Structured Predictors for Novel Loss Functions

author: David McAllester, Toyota Technological Institute at Chicago
published: Jan. 19, 2010, recorded: December 2009, views: 3453




As motivation, we consider the PASCAL image segmentation challenge. Given an image and a target class, such as person, the challenge is to segment the image into regions occupied by objects of that class (person foreground) and regions not occupied by that class (non-person background). At the present state of the art, the lowest pixel error rate is achieved by predicting all background. However, the challenge is evaluated with an intersection-over-union score, under which the all-background prediction scores zero. This raises the question of how to incorporate a particular loss function into the training of a structured predictor. A standard approach is to build the desired loss into the structured hinge loss and observe that, for any loss, the structured hinge loss is an upper bound on the desired loss. However, this upper bound is quite loose, and it is far from clear that the structured hinge loss is an appropriate or useful way to handle the PASCAL evaluation measure.
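The gap between pixel error and intersection over union can be illustrated with a small sketch. The toy 10x10 image, the function names, and the "rough" over-segmentation below are assumptions made for illustration; they are not taken from the talk or from the PASCAL evaluation code.

```python
import numpy as np


def pixel_error(pred, truth):
    """Fraction of pixels whose predicted label differs from the ground truth."""
    return float(np.mean(pred != truth))


def intersection_over_union(pred, truth):
    """IoU of the foreground (True) regions; taken to be 0 when the union is empty."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union > 0 else 0.0


# Toy ground truth: a 2x5 "person" region covering 10 of 100 pixels.
truth = np.zeros((10, 10), dtype=bool)
truth[4:6, 2:7] = True

# Prediction 1: everything is background.
all_background = np.zeros_like(truth)

# Prediction 2 (hypothetical): a sloppy over-segmentation that contains the person.
rough = np.zeros_like(truth)
rough[2:8, 0:9] = True

for name, pred in [("all background", all_background), ("rough", rough)]:
    print(name, pixel_error(pred, truth), intersection_over_union(pred, truth))
```

In this toy case the all-background prediction has the lower pixel error (0.10) but an intersection-over-union score of zero, while the over-segmentation has a higher pixel error and a nonzero score, so the two measures rank the predictions in opposite orders.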

This talk reviews various approaches to this problem and presents a new training algorithm that we call the good-label-bad-label algorithm. We prove that, in the data-rich regime, the good-label-bad-label algorithm follows the gradient of the training loss, assuming only that we can perform inference in the given graphical model. The algorithm is structurally similar to, but significantly different from, stochastic subgradient descent on the structured hinge loss (which does not follow the loss gradient).
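For reference, the baseline mentioned above, stochastic subgradient descent on the structured hinge loss, can be sketched as follows. This is the standard hinge baseline only; the abstract does not give the details of the good-label-bad-label update, so it is not attempted here. The feature map, the 1 - IoU task loss, the brute-force inference over a tiny label space, and all function names are assumptions made for illustration.

```python
import itertools

import numpy as np


def features(x, y):
    """Hypothetical joint feature map: sum of per-pixel features over foreground pixels."""
    return x[np.array(y, dtype=bool)].sum(axis=0)


def task_loss(y_true, y_pred):
    """Assumed task loss: 1 - IoU of the foreground, and 0 if both labelings are empty."""
    t, p = np.array(y_true, dtype=bool), np.array(y_pred, dtype=bool)
    union = np.logical_or(t, p).sum()
    return 0.0 if union == 0 else 1.0 - np.logical_and(t, p).sum() / union


def all_labels(n):
    """Brute-force label space; only feasible for toy problems."""
    return list(itertools.product([0, 1], repeat=n))


def predict(w, x):
    """Inference: the labeling with the highest score w . phi(x, y)."""
    return max(all_labels(len(x)), key=lambda y: w @ features(x, y))


def hinge_step(w, x, y_true, lr=0.1):
    """One stochastic subgradient step on the structured hinge loss."""
    # Loss-augmented inference: maximize score plus task loss.
    y_hat = max(all_labels(len(x)),
                key=lambda y: w @ features(x, y) + task_loss(y_true, y))
    hinge = (w @ features(x, y_hat) + task_loss(y_true, y_hat)
             - w @ features(x, y_true))
    # Subgradient of the hinge loss with respect to w.
    grad = features(x, y_hat) - features(x, y_true)
    return w - lr * grad, hinge


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # 4 "pixels", 3 features each
w = rng.normal(size=3)
y_true = (1, 0, 1, 0)

w_new, hinge = hinge_step(w, x, y_true)
print(hinge, task_loss(y_true, predict(w, x)))
```

The hinge value is never smaller than the task loss of the predicted labeling, which is the upper-bound property noted in the abstract. The update direction depends on the loss-augmented labeling rather than on the gradient of the task loss itself, which is the sense in which this baseline does not follow the loss gradient.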

See Also:

Download slides: nipsworkshops09_mcallester_tspnlf_01.pdf (133.6 KB)
