Speech Recognition and Deep Learning

Published on 2015-09-13

9373 Views

Adam Coates

Deep Learning Summer School 2015 - Montreal

Related categories

Deep Learning, Reinforcement Learning, Unsupervised Learning

Presentation

Speech Recognition and Deep Learning (00:00)

Speech recognition - 1 (01:09)

Speech recognition - 2 (02:40)

Speech recognition - 3 (03:23)

Outline (06:58)

Traditional speech models (08:34)

Basic pipeline - 1 (08:38)

Basic pipeline - 2 (09:28)

Basic pipeline - 3 (10:32)

Basic pipeline - 4 (12:36)

Basic pipeline - 5 (15:47)

Basic pipeline - 6 (16:41)

Features (18:02)

Example: Spectrogram - 1 (18:21)

Example: Spectrogram - 2 (19:31)

Back to modeling (24:54)

Acoustic model (24:57)

Modeling 1 phoneme - 1 (25:52)

Modeling 1 phoneme - 2 (27:40)

Modeling 1 phoneme - 3 (27:43)

Modeling 1 phoneme - 4 (27:53)

Modeling 1 phoneme - 5 (27:55)

Modeling 1 phoneme - 6 (27:57)

Modeling 1 phoneme - 7 (28:00)

Modeling 1 phoneme - 8 (28:01)

Inference with 1 phoneme - 1 (28:20)

Inference with 1 phoneme - 2 (30:10)

Modeling a word (31:43)

Training from sentences - 1 (32:50)

Training from sentences - 2 (34:26)

Obstacles... (36:11)

Language modeling (36:52)

Putting it together (39:54)

Decoder - 1 (40:34)

Decoder - 2 (40:53)

Decoder - 3 (41:18)

Decoder visualization - 1 (42:57)

Decoder - 4 (44:34)

Decoder - 5 (46:09)

Decoder visualization - 2 (47:59)

Decoder visualization - 3 (48:25)

Decoder visualization - 4 (48:33)

Decoder visualization - 5 (49:21)

Decoder visualization - 6 (49:32)

Decoder visualization - 7 (50:03)

Decoder visualization - 8 (50:10)

Decoder visualization - 9 (50:14)

Decoder visualization - 10 (50:17)

Decoder visualization - 11 (50:19)

Decoder visualization - 12 (50:20)

Decoder visualization - 13 (50:21)

Decoder visualization - 14 (50:24)

Decoder visualization - 15 (50:36)

Decoder visualization - 16 (51:46)

Decoder visualization - 17 (51:52)

Decoder visualization - 18 (52:20)

Decoder visualization - 19 (52:28)

Is that all? (53:30)

Deep Learning (56:04)

Where can DL help? (56:14)

Basic pipeline (56:52)

DNN acoustic models - 1 (58:25)

DNN acoustic models - 2 (59:49)

DNN acoustic models - 3 (01:00:36)

DNN acoustic models - 4 (01:02:46)

Early wins for DNN models (01:05:12)

More powerful acoustic models (01:06:18)

Rescoring (01:08:51)

Rescoring with Neural LM (01:09:45)

Training from unsegmented data with CTC (01:11:30)

Complexity (01:11:32)

Network setup (01:12:46)

Problem (01:15:25)

Collapsing operator (01:16:11)

Likelihood of sequence (01:17:38)

Training - 1 (01:18:50)

Training - 2 (01:18:57)

Training - 3 (01:19:23)

Decoding - 1 (01:21:17)

Decoding - 2 (01:21:26)

Decoding - 3 (01:21:56)

End-to-end learning - 1 (01:23:39)

Example transcriptions (01:24:38)

Conclusion (01:25:55)

Thank you (01:27:01)