Where machine vision needs help from machine learning

Published on 2011-08-02 (10594 views)

William T. Freeman

I'll describe where computer vision needs advances from computer science and machine learning. This talk will cover where computer vision works well: finding cars and faces, operating in controlled e

COLT 2011 - Budapest

Presentation

00:00 Where machine vision needs help from machine learning

00:06 Outline - 1

00:40 The Taiyuan University of Technology Computer Center staff, and me (1987) - 1

00:57 The Taiyuan University of Technology Computer Center staff, and me (1987) - 2

01:02 Me and my wife, riding from the Foreigners’ Cafeteria

01:10 Me in my office at the Computer Center

01:23 VISION

01:35 Goal of computer vision

02:02 Some particular goals of computer vision

02:42 Companies and applications

02:57 COGNEX

03:26 POSEIDON

03:41 Natatorium

03:44 Saved by a computer lifeguard

03:58 Mobileye

04:42 1998 Journal publication

05:01 Sony EyeToy, 2003

05:13 Microsoft Kinect, 2010

06:04 Identix

06:20 Google Image Search, Google Goggles - 1

06:33 Google Image Search, Google Goggles - 2

07:07 The Computer Vision Industry

07:17 Games and Gesture Recognition

07:21 Industrial automation and inspection

07:24 Object Recognition for Mobile Devices

07:52 Three-dimensional modeling

09:16 Outline - 2

09:30 Outline - 3

09:33 What makes computer vision hard? - 1

09:39 What makes computer vision hard? - 2

09:47 What makes computer vision hard? - 3

10:12 Intra-class variation

10:36 Object recognition issues

11:21 Computer vision features over time

11:32 What everyone looked like back then

11:40 Features

11:46 Objects

11:57 Computer vision research results, 1986

12:37 Computer vision research results, 1992

12:57 Back to the present ...

13:15 What has allowed us to make progress? - 1

14:22 What has allowed us to make progress? - 2

14:29 CVPR 2003 Tutorial - 1

14:49 CVPR 2003 Tutorial - 2

15:09 Invariant Local Features

16:27 Hand under two different lighting conditions

17:06 SIFT vector formation - 1

17:30 SIFT vector formation - 2

19:04 Feature stability to noise

19:38 Feature stability to affine change

20:00 Distinctiveness of features

20:34 Figure 12

21:15 Figure 13

21:39 Building a Panorama - 1

21:45 Building a Panorama - 2

21:56 These feature point detectors and descriptors are the most important recent advance in computer vision and graphics

22:28 Still another use for SIFT features

22:50 Extracting words

23:16 Visual words

23:24 Object recognition using visual words

24:15 Now this starts to look like a learning theory problem

24:47 What else do we want, to make progress?

24:59 My poll of the top researchers in computer vision

25:21 “How do you think computer science can best help computer vision?” - 1

25:27 “How do you think computer science can best help computer vision?” - 2

25:36 “How do you think computer science can best help computer vision?” - 3

25:51 Nearest neighbor search in high dimensions

26:25 Fast Approximate Nearest Neighbors With Automatic Algorithm Configuration

26:34 Comparison of different algorithms

27:30 Additional structure present in NN problems for computer vision

28:31 Another NN search problem, with structure: non-local means denoising

28:52 Non-local means denoising algorithm

30:03 An approx nearest-neighbor algo. that takes image spatial structure into account - 1

30:20 An approx nearest-neighbor algo. that takes image spatial structure into account - 2

30:26 An approx nearest-neighbor algo. that takes image spatial structure into account - 3

30:36 An approx nearest-neighbor algo. that takes image spatial structure into account - 4

30:48 An approx nearest-neighbor algo. that takes image spatial structure into account - 5

30:58 An approx nearest-neighbor algo. that takes image spatial structure into account - 6

31:02 An approx nearest-neighbor algo. that takes image spatial structure into account - 7

31:30 An approx nearest-neighbor algo. that takes image spatial structure into account - 8

31:44 An approx nearest-neighbor algo. that takes image spatial structure into account - 9

34:11 Problem - 1

34:27 Another commonly expressed need: help in scaling up algorithms

35:06 Problem - 2

35:11 Outline - 4

35:30 Priors on images

36:42 Removing camera shake - 1

37:02 Removing camera shake - 2

37:17 Close-up - 1

37:18 Close-up - 2

37:20 Close-up - 3

37:29 Image formation process

38:04 Multiple possible solutions - 1

38:08 Multiple possible solutions - 2

38:23 Multiple possible solutions - 3

38:30 Multiple possible solutions - 4

38:49 Multiple possible solutions - 5

38:51 Is each of the images that follow sharp or blurred?

38:57 Picture - 1

39:02 Picture - 2

39:04 Picture - 3

39:19 Natural image statistics

40:05 Blurry images have different statistics

40:18 Parametric distribution

40:26 Three sources of information - 1

40:28 Three sources of information - 2

40:34 Three sources of information - 3

40:39 Three sources of information - 4

40:45 Bayesian estimate of latent image, x, and blur kernel, b.

41:23 Original photograph

41:32 Our output

41:43 Matlab's deconvblind

41:48 Close-up of garland

42:06 A stronger prior might help us deblur this image

42:22 Problem - 3

42:30 Texture Synthesis by Non-parametric Sampling

43:15 Algorithm

43:27 Picture - 4

43:34 Picture - 5

43:38 Picture - 6

43:53 Problem - 4

44:14 Special case of an image prior

45:28 Some methods of approximate inference in MRF’s

45:55 MRF/CRF wishes - 1

46:10 Input image

46:57 MRF/CRF wishes - 2

47:31 Image Segmentation with Bounding Box Prior

48:37 Other constraints

48:46 Problem - 5

49:01 Outline - 5

49:06 Compressed sensing - 1

49:15 Compressed sensing - 2

49:33 Problem - 6

49:45 Large, noisy datasets

50:48 Problem - 7

50:51 Shai Avidan

50:55 Blind vision

50:57 Problem - 8

51:55 Me

51:58 Continuous to discrete representations

52:52 Problem - 9

52:59 Deva Ramanan

53:01 Evaluate easily over a powerset of all segmentations

53:24 Problem - 10

53:28 Alyosha Efros

53:31 Efros and Hoiem comments

54:02 Really, we’d like another breakthrough...

54:18 David Lowe - 1

54:23 David Lowe - 2

54:45 Most references are in citation list of this manuscript

54:57 Computer vision academic culture

56:12 A computer graphics application of nearest-neighbor finding in high dimensions - 1

56:17 A computer graphics application of nearest-neighbor finding in high dimensions - 2

56:40 The image database

56:46 Obtaining semantically coherent themes

57:15 Image representation

57:40 Basic camera motions

57:41 Scene matching with camera view transformations: Translation - 1

57:48 Scene matching with camera view transformations: Translation - 2

58:03 Scene matching with camera view transformations: Translation - 3

58:14 Scene matching with camera view transformations: Translation - 4

58:20 Scene matching with camera view transformations: Translation - 5

58:29 Scene matching with camera view transformations: Translation - 6

58:55 Scene matching with camera view transformations: Camera rotation - 1

59:03 Scene matching with camera view transformations: Camera rotation - 2

59:07 Scene matching with camera view transformations: Camera rotation - 3

59:08 Scene matching with camera view transformations: Camera rotation - 4

59:09 Scene matching with camera view transformations: Camera rotation - 5

59:11 More “infinite” images – camera translation

59:19 Video - 1

59:47 Video - 2

01:00:08 Video - 3