Deep Learning with Multiplicative Interactions

author: Geoffrey E. Hinton, Department of Computer Science, University of Toronto
published: Jan. 20, 2010,   recorded: December 2009,   views: 2162

Slides
0:00 Deep learning with multiplicative interactions
1:32 Overview
2:10 Restricted Boltzmann Machines
2:58 The Energy of a joint configuration (ignoring terms to do with biases)
3:34 Using energies to define probabilities
4:01 A picture of the maximum likelihood learning algorithm for an RBM
4:56 A quick way to learn an RBM
5:29 Training a deep network (the main reason RBMs are interesting)
6:31 Fine-tuning for discrimination
7:34 Why unsupervised pre-training makes sense
9:16 A neat application of deep learning
11:10 One very deep belief net for phone recognition
11:39 A simple real-valued visible unit
11:42 One very deep belief net for phone recognition
11:58 A simple real-valued visible unit
12:51 The new idea
13:38 Generating the parts of an object: why multiplicative interactions are useful
14:37 Generating the parts of an object
15:24 Towards a more powerful, multi-linear stackable learning module
16:04 Higher order Boltzmann machines (Sejnowski, ~1986)
16:46 Using higher-order Boltzmann machines to model image transformations
18:23 A picture of the rank 1 tensor contributed by factor f
18:36 Factoring three-way multiplicative interactions
19:52 Inference with factored three-way multiplicative interactions
21:28 Belief propagation
22:08 Learning with factored three-way multiplicative interactions
23:16 Showing what a factor learns by alternating between its pre- and post- fields
24:07 The factor receptive fields, 1
24:16 The factor receptive fields, 2
24:19 The factor receptive fields, 1
24:20 The factor receptive fields, 2
24:22 The factor receptive fields, 1
24:25 The factor receptive fields, 2
24:44 The factor receptive fields (spirals), 1
24:46 The factor receptive fields (spirals), 2
24:48 The factor receptive fields (spirals), 1
24:51 The factor receptive fields (spirals), 2
24:57 How does it perceive two overlaid sparse dot patterns moving in different directions?
26:54 Time series models
27:39 The conditional RBM model (a partially observed bipartite CRF)
28:49 Causal generation from a learned model
29:22 Higher level models
29:52 An application to modeling motion capture data
30:55 Using a style variable to modulate the interactions
32:29 Show demos of multiple styles of walking
34:56 Modeling the covariance structure of a static image by using two copies of the image
36:08 An advantage of modeling covariances between pixels rather than pixels
36:57 Using linear filters to model the inverse covariance matrix of two pixel intensities
37:17 Modulating the precision matrix by using additive contributions that can be switched off
37:58 Using binary hidden units to remove violated smoothness constraints
39:47 Inference with hidden units that represent active smoothness constraints
40:33 Learning with an adaptive precision matrix
40:50 Hybrid Monte Carlo
41:43 mcRBM (mean and covariance RBM)
43:27 Why is the map topographic?
44:35 Multiple reconstructions from the same hidden state of a mcRBM
45:45 Test examples from the CIFAR-10 dataset
47:12 Application to the CIFAR-10 labeled subset of the TINY images dataset (Marc’Aurelio Ranzato)
48:01 How well does it discriminate?
48:27 Percent correct on CIFAR-10 test data
49:32 Summary
50:20 THE END, Questions

Description

Deep networks can be learned efficiently from unlabeled data. The layers of representation are learned one at a time using a simple learning module that has only one layer of latent variables. The values of the latent variables of one module form the data for training the next module. The most commonly used modules are Restricted Boltzmann Machines or autoencoders with a sparsity penalty on the hidden activities. Although deep networks have been quite successful for tasks such as object recognition, information retrieval, and modeling motion capture data, the simple learning modules do not have multiplicative interactions, which are very useful for some types of data. The talk will show how a third-order energy function can be factorized to yield a simple learning module that retains advantageous properties of a Restricted Boltzmann Machine, such as very simple exact inference and a very simple learning rule based on pairwise statistics. The new module contains multiplicative interactions that are useful for a variety of unsupervised learning tasks. Researchers at the University of Toronto have been using this type of module to extract oriented energy from image patches and dense flow fields from image sequences. The new module can also be used to allow the style of a motion to blend autoregressive models of motion capture data. Finally, the new module can be used to combine an eye position with a feature vector, allowing a system that has a variable-resolution retina to integrate information about shape over many fixations.
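
The factorization described in the abstract can be illustrated with a small NumPy sketch. It is not taken from the talk: the names `Wx`, `Wy`, `Wh`, the array shapes, and the omission of bias terms are all assumptions made for illustration. The sketch assumes three groups of units (`x`, `y`, and binary hidden units `h`) and a three-way weight tensor written as a sum of rank-1 terms, one per factor. Under those assumptions it shows that the factored energy equals the full third-order energy, and that the hidden units remain conditionally independent given `x` and `y`, so exact inference is a single sigmoid.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def factored_energy(x, y, h, Wx, Wy, Wh):
    """Energy of a factored three-way model (biases ignored).

    Assumed shapes: Wx is (n_x, F), Wy is (n_y, F), Wh is (n_h, F).
    Each factor f multiplies three linear filter outputs together.
    """
    return -np.sum((x @ Wx) * (y @ Wy) * (h @ Wh))


def full_tensor_energy(x, y, h, Wx, Wy, Wh):
    """Same energy computed through the explicit third-order tensor.

    W[i, j, k] = sum_f Wx[i, f] * Wy[j, f] * Wh[k, f]
    """
    W = np.einsum("if,jf,kf->ijk", Wx, Wy, Wh)
    return -np.einsum("i,j,k,ijk->", x, y, h, W)


def hidden_probs(x, y, Wx, Wy, Wh):
    """p(h_k = 1 | x, y) for binary hidden units.

    Given x and y the energy is linear in h, so the hidden units are
    conditionally independent and each probability is a sigmoid.
    """
    factor_input = (x @ Wx) * (y @ Wy)  # one number per factor
    return sigmoid(Wh @ factor_input)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_x, n_y, n_h, n_factors = 6, 5, 4, 3  # arbitrary illustrative sizes
    Wx = 0.3 * rng.normal(size=(n_x, n_factors))
    Wy = 0.3 * rng.normal(size=(n_y, n_factors))
    Wh = 0.3 * rng.normal(size=(n_h, n_factors))
    x = rng.normal(size=n_x)
    y = rng.normal(size=n_y)
    h = rng.integers(0, 2, size=n_h).astype(float)

    print("factored energy:", factored_energy(x, y, h, Wx, Wy, Wh))
    print("full tensor energy:", full_tensor_energy(x, y, h, Wx, Wy, Wh))
    print("p(h = 1 | x, y):", hidden_probs(x, y, Wx, Wy, Wh))
```

The factored form stores `(n_x + n_y + n_h) * F` parameters instead of the `n_x * n_y * n_h` entries of the full tensor, which is the main practical motivation for factoring in a sketch like this one.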
