Learning Feature Hierarchies
Author: Yann LeCun, Computer Science Department, New York University
| Slides | |
| 0:00 | Learning Feature Hierarchies |
| 0:30 | The Next Frontier in Machine Learning: Learning Representations |
| 2:07 | The Traditional “Shallow” Architecture for Recognition |
| 2:41 | The Next Challenge of ML, Vision (and Neuroscience) |
| 3:20 | Good Representations are Hierarchical |
| 4:27 | “Deep” Learning: Learning Hierarchical Representations |
| 4:47 | The Primate's Visual System is Deep |
| 5:49 | Do we really need deep architectures? |
| 6:35 | Why are Deep Architectures More Efficient? |
| 7:05 | Feature Extraction in Computer Vision |
| 9:23 | Trainable Feature Extraction: Hubel-Wiesel Stage |
| 10:27 | Deep Architecture: Multi-Stage Hubel-Wiesel Architecture |
| 11:02 | Deep Architecture: The Multistage Hubel-Wiesel Architecture |
| 13:54 | Convolutional Net: Supervised Multi-Stage Hubel-Wiesel Arch. |
| 14:18 | Supervised Training of Convolutional Network |
| 14:44 | Supervised Convolutional Nets learn well with lots of data |
| 14:47 | NORB Generic Object Recognition Dataset |
| 14:48 | Textured and Cluttered Datasets |
| 14:49 | Face Detection: Results |
| 14:51 | Face Detection and Pose Estimation: Results |
| 14:52 | Face Detection with a Convolutional Net |
| 14:52 | Industrial Applications of (supervised) ConvNets |
| 15:04 | Problem: ConvNets don't work when labeled samples are scarce |
| 16:34 | How Do We Learn Features from Unlabeled Samples? |
| 17:45 | Deep Learning: Stack of Encoder/Decoders (1) |
| 18:47 | Deep Learning: Stack of Encoder/Decoders (2) |
| 19:05 | Deep Learning: Stack of Encoder/Decoders (3) |
| 19:18 | Deep Learning: Stack of Encoder/Decoders (4) |
| 19:42 | Training an Encoder/Decoder Module |
| 20:53 | Each Stage is Trained as an Estimator of the Input Density |
| 21:12 | Energy <-> Probability |
| 21:25 | The Intractable Normalization Problem |
| 22:09 | Training an Energy-Based Model to Approximate a Density |
| 23:06 | Training an Energy-Based Model with Gradient Descent |
| 23:44 | Solving The Intractable Normalization problem |
| 23:55 | Training an Energy-Based Model with Gradient Descent |
| 24:14 | Solving The Intractable Normalization problem |
| 25:02 | The Main Insight [Ranzato et al. 2007] |
| 25:08 | Why Limit the Information Content of the Code? (1) |
| 25:19 | Why Limit the Information Content of the Code? (2) |
| 25:58 | Why Limit the Information Content of the Code? (3) |
| 26:14 | Why Limit the Information Content of the Code? (4) |
| 26:22 | Why Limit the Information Content of the Code? (5) |
| 26:29 | Why Limit the Information Content of the Code? (6) |
| 27:05 | Sparsity Penalty to Restrict the Code |
| 27:19 | Why Limit the Information Content of the Code? (6) |
| 27:49 | Sparsity Penalty to Restrict the Code |
| 28:11 | Sparse Decomposition with Linear Reconstruction |
| 30:12 | Problem with Sparse Decomposition: It's slow |
| 30:50 | Solution: Predictive Sparse Decomposition (PSD) |
| 34:04 | PSD: Inference |
| 34:46 | PSD: Learning [Kavukcuoglu et al. 2009] |
| 34:47 | PSD: Learning Algorithm |
| 35:15 | Decoder Basis Functions on MNIST |
| 36:06 | PSD Training on Natural Image Patches |
| 36:51 | How well do PSD features work on Caltech-101? |
| 37:39 | Procedure for a single-stage system |
| 37:57 | Using PSD Features for Recognition |
| 38:04 | Feature Extraction (1) |
| 38:07 | Feature Extraction (2) |
| 38:08 | Feature Extraction (4) |
| 38:09 | Feature Extraction (5) |
| 38:11 | Feature Extraction (6) |
| 38:12 | Feature Extraction (7) |
| 38:13 | Feature Extraction (8) |
| 38:27 | Feature Extraction (10) |
| 38:28 | Feature Extraction (9) |
| 38:28 | Feature Extraction (11) |
| 38:32 | Training Protocol |
| 39:17 | Using PSD Features for Recognition (1) |
| 41:08 | Using PSD Features for Recognition (2) |
| 41:18 | Using PSD Features for Recognition (1) |
| 41:41 | Using PSD Features for Recognition (2) |
| 41:57 | Comparing Optimal Codes vs. Predicted Codes on Caltech 101 |
| 41:59 | Training a Multi-Stage Hubel-Wiesel Architecture with PSD |
| 42:39 | Multistage Hubel-Wiesel Architecture on Caltech-101 |
| 42:40 | Multistage Hubel-Wiesel Architecture |
| 42:42 | Multistage Hubel-Wiesel Architecture on Caltech-101 |
| 44:45 | Two-Stage Result Analysis |
| 44:51 | Multistage Hubel-Wiesel Architecture: Filters |
| 44:55 | MNIST dataset (1) |
| 44:58 | MNIST dataset (2) |
| 45:43 | Why Random Filters Work? |
| 47:23 | Small NORB dataset (1) |
| 47:26 | Small NORB dataset (2) |
| 49:50 | Learning Invariant Features [Kavukcuoglu et al. CVPR 2009] |
| 50:11 | Learning the filters and the pools (1) |
| 50:15 | Learning Invariant Features [Kavukcuoglu et al. CVPR 2009] |
| 50:54 | Learning the filters and the pools (1) |
| 51:57 | Learning the filters and the pools (2) |
| 52:40 | Pinwheels? |
| 52:40 | Invariance Properties Compared to SIFT |
| 52:43 | Learning Invariant Features |
| 52:46 | Recognition Accuracy on Caltech 101 |
| 53:16 | FPGA Custom Board: NYU ConvNet Proc |
| 53:20 | DARPA/LAGR: Learning Applied to Ground Robotics |
| 54:35 | Long Range Vision: Distance Normalization |
| 54:41 | Long Range Vision Results (1) |
| 54:59 | Long Range Vision Results (2) |
| 55:07 | The End |
| 55:26 | Questions |




