Automated Character Annotation in Multimedia

Published on 2008-02-14

Andrew Zisserman

We describe progress in automatically identifying characters in films and TV series using their detected faces together with readily available annotation in the form of subtitles and transcripts. We d

MCVC '08 - Cannes

Presentation

Automated Character Annotation in Multimedia (00:00)

The Objective - 1 (00:18)

The Objective - 2 (00:44)

Multimedia (Vision and Text) Approach (01:08)

The Need (01:52)

Outline (02:57)

Names and Faces in the News (04:02)

Weak Supervision from Text (05:08)

Running Example: Use Episodes from Buffy the Vampire Slayer (05:18)

Textual Annotation: Subtitles/Closed-Captions (05:56)

Textual Annotation: Script (06:33)

Alignment by Dynamic Time Warping (07:07)

Subtitle/Script Alignment (07:18)

Virtually Free Source of Annotation (08:01)

Ambiguity (08:27)

Face Representation and Matching (10:23)

Why This is Difficult: Uncontrolled Viewing Conditions (10:28)

Matching Faces - 1 (10:55)

Matching Faces - 2 (11:09)

The Benefits of Video (12:06)

Three Steps (12:39)

Obtaining Sets of Faces Using Tracking within Shots (12:56)

Face Detection (12:57)

"Tracking" by Face Detection (13:29)

Face Association (13:44)

Connecting Face Detections Temporally (14:21)

Example Face Tracks (14:56)

Face Vector Representation (15:22)

Matching Faces (15:23)

Detect Face Features for Rectification (15:51)

Eyes/Nose/Mouth Detectors (16:08)

Constellation Like Appearance/Shape Model (16:15)

Face Normalization (16:24)

Representing Faces (17:21)

SIFT Descriptor (17:31)

Face Feature Vector - Summary (17:44)

Matching Face Sets - 1 (17:58)

Matching Face Sets - 2 (18:00)

Matching Face Sets - 3 (18:12)

Matching Face Sets within a Shot (18:28)

Example: Buffy the Vampire Slayer (18:51)

Raw Face Detections (20:04)

Face Tubes (Tracking Only) (20:37)

Intra-Shot Matching (21:15)

Ambiguity Again (22:24)

Speaker Detection - 1 (23:01)

Speaker Detection - 2 (23:20)

Correct "Non-Speaking" Classifications (24:15)

Error in Speaker Classification (24:37)

Resolved Ambiguity (24:58)

Semi-Supervised Learning (25:40)

Exemplar Extraction (26:07)

Classification by Exemplar Sets (26:37)

"Refusal to Predict" (27:22)

Experiments (27:54)

Example Results - 1 (28:18)

Example Results - 2 (28:39)

Precision/Recall (28:48)

Example Video (29:33)

Quantitative Results (31:08)

Using an SVM Classifier – Noisy Labels (31:33)

Classification Results (Inter-Episode) (32:45)

Extensions (32:47)

Improving Coverage – Beyond Frontal Faces (32:48)

Feature Localization & Speaker Detection (32:58)

Profile Speaker Detection (33:08)

Summary and Extensions (33:41)