People In Motion: Pose, Action and Communication

Published on 2012-10-09 · 3639 Views

Stan Sclaroff

This talk will give an overview of some of the research in the Image and Video Computing Group at Boston University related to tracking, analysis, recognition and retrieval of images and video based o

BMVC 2012 - Surrey

Related categories

Motion and Tracking

Presentation

People in Motion: Pose, Action, and Communication 00:00

Overview 00:05

Pose (and matching) 01:09

Human Parsing 01:40

Tree vs. Non-tree Models 02:12

Pairwise Appearance Constraint Helps 03:27

Computational Complexity 03:56

Branch-and-Bound (1) 04:57

Branch-and-Bound (2) 06:16

Reusing Messages 06:46

O(1) Time: Range Minimum Queries 07:27

Speeding Up Lower Bound Computation 08:06

Multi Aspect Modeling 08:19

Problem: Continuous variables 09:06

Non-tree Model 09:47

Linearly Augmented Tree (LAT) 10:15

Optimizing Invariant Matching (1) 10:44

Optimizing Invariant Matching (2) 11:08

Optimizing Invariant Matching (3) 11:18

Optimizing Invariant Matching (4) 11:33

Mixed Integer Optimization 12:09

Linear Relaxation 12:34

The Solution Space 13:23

Column Generation (1) 14:02

Column Generation (2) 14:14

Example Experiments: Matching Unreliable Regions 14:56

Example Results 15:12

Some Observations 16:30

Action 18:10

Action Recognition in Uncontrolled Videos 19:07

Learning Actions from the Web 20:01

Overall System 21:14

Step 1: Action Image Retrieval 23:48

Image Representation 24:31

Incremental Dataset Collection 25:25

Action Image Retrieval Results 26:26

Step 2: Learning Action Pose Models 27:47

Action Recognition in Videos 28:34

Example Video 29:33

Action Recognition In YouTube Videos 30:05

Using Web Poses and Video Poses Together 30:29

Ordered Pose Pairs (OPP) 31:48

Exploiting Scene, Actions, and Objects 32:43

Problem/Approach (1) 33:46

Problem/Approach (2) 36:23

Experimental Evaluation 37:29

Results 38:05

Visual Examples 39:31

Communication 41:44

Sign-Based Recognition and Retrieval 42:16

Ongoing Work: Sign Lookup System 42:56

The American Sign Language Lexicon Video Dataset (ASLLVD) 43:28

Dataset Characteristics (1) 44:04

Dataset Characteristics (2) 45:16

Using Linguistic Constraints to Improve Recognition and Retrieval 45:34

Handshape Bayesian Network (HSBN) 46:06

Experiments 48:30

Behavior Shaping Sessions Measuring Progress in Learning Signs 49:08

Computational Behavioral Imaging 51:15

Recap 52:31

Collaborators in Work Presented 52:36

Image and Video Computing Group 53:08