Recurrent Neural Networks (RNNs)

Published on 2017-07-27, 21392 Views

DLSS & RLSS 2017 - Montreal

Recurrent Neural Networks 00:00

Recurrent Neural Networks - 1 00:02

Recurrent Neural Networks - 2 10:10

Generative RNNs 11:54

Conditional Distributions 12:28

Maximum Likelihood = Teacher Forcing 15:25

Ideas to reduce the train/generate mismatch in teacher forcing 18:15

Multiplicative Interactions 26:02

Bidirectional RNNs, Recursive Nets, Multidimensional RNNs, etc. 26:38

Increasing the Expressive Power of RNNs with more Depth 26:38

Learning Long-Term Dependencies with Gradient Descent is Difficult 27:27

Simple Experiments from 1991 while I was at MIT 27:54

Robustly storing 1 bit in the presence of bounded noise 30:45

Storing Reliably - Vanishing gradients 32:24

Vanishing or Exploding Gradients 34:01

Why it hurts gradient-based learning 34:46

Vanishing Gradients in Deep Nets are Different from the Case in RNNs 38:18

To store information robustly the dynamics must be contractive 39:49

RNN Tricks 40:44

How to store 1 bit? 42:00

Dealing with Gradient Explosion by Gradient Norm Clipping 42:41

Conference version (1993) of the 1994 paper by the same authors 44:11

Fighting the vanishing gradient: LSTM & GRU 49:12

Fast Forward 20 years: Attention Mechanisms for Memory Access 52:06

Large Memory Networks: Sparse Access Memory for Long-Term Dependencies 54:32

Attention Mechanism for Deep Learning 56:39

End-to-End Machine Translation with Recurrent Nets and Attention Mechanism 01:00:18

Google-Scale NMT Success 01:00:29

Pointing the Unknown Words 01:02:33

It makes a difference 01:06:00

Designing the RNN Architecture 01:06:24

Near-Orthogonality to Help Information Propagation 01:06:41

Variational Generative RNNs 01:07:20

Variational Hierarchical RNNs for Dialogue Generation 01:08:01

VHRNN Results – Twitter Dialogues 01:08:22

Other Fully-Observed Neural Directed Graphical Models 01:08:23

Neural Auto-Regressive Models 01:08:24

NADE: Neural AutoRegressive Density Estimator 01:11:36

Pixel RNNs 01:12:19

Forward Computation of the Gradient 01:12:53

Delays & Hierarchies to Reach Farther 01:24:50