From Language Modelling to Machine Translation
Published on Sep 13, 2015 · 6984 views
Chapter list
From Language Modelling to Machine Translation (00:00)
Language models: the traditional view - 1 (00:36)
Language models: the traditional view - 2 (02:21)
History: cryptography (04:23)
N-gram language models (05:28)
The Traditional Markov Chain (06:30)
Estimating N-Gram Probabilities (07:30)
How good is a LM? (08:15)
Comparison 1–4-Gram (12:00)
Unseen N-Grams (12:56)
Add-One Smoothing (13:39)
Add-α Smoothing (14:37)
Example: 2-Grams in Europarl (14:54)
Good-Turing Smoothing (16:33)
Good-Turing for 2-Grams in Europarl (18:06)
Back-Off - 1 (18:56)
Back-Off - 2 (19:53)
Back-Off with Good-Turing Smoothing (20:59)
Diversity of Predicted Words (21:47)
Diversity of Histories (22:56)
Evaluation (23:57)
Provisional Summary - 1 (26:01)
Provisional Summary - 2 (26:45)
Neural language models (28:33)
Log-linear models for classification (29:11)
A simple log-linear (tri-gram) language model - 1 (30:23)
A simple log-linear (tri-gram) language model - 2 (31:56)
Learning the features: the log-bilinear language model - 1 (32:24)
Learning the features: the log-bilinear language model - 2 (33:14)
Learning the features: the log-bilinear language model - 3 (33:58)
Learning the features: the log-bilinear language model - 4 (34:22)
Learning the features: the log-bilinear language model - 5 (34:33)
Learning the features: the log-bilinear language model - 6 (34:48)
Adding non-linearities: the neural language model - 2 (35:28)
Infinite context: a recurrent neural language model - 1 (35:47)
Infinite context: a recurrent neural language model - 2 (37:35)
Infinite context: a recurrent neural language model - 3 (37:43)
LSTM LM (39:01)
Infinite context: a recurrent neural language model (41:33)
Deep LSTM LM - 1 (41:45)
Deep LSTM LM - 2 (42:27)
Deep LSTM LM - 3 (42:31)
Efficiency - 1 (44:14)
Efficiency - 2 (46:03)
Efficiency - 3 (47:23)
Efficiency - 4 (48:51)
Efficiency - 5 (49:35)
Efficiency - 6 (50:08)
Comparison with traditional n-gram LMs (50:20)
Learning better representations for rich morphology - 1 (51:32)
Learning better representations for rich morphology - 2 (51:36)
Learning representations directly (51:36)
Intro to MT (51:37)
Intro to MT: Language Divergence (52:39)
Models of translation (53:29)
MT History - 1 (54:02)
MT History - 2 (54:57)
Parallel Corpora - 1 (55:40)
Parallel Corpora - 2 (56:35)
MT History: Statistical MT at IBM - 1 (57:38)
MT History: Statistical MT at IBM - 2 (58:26)
Models of translation - 1 (59:38)
Models of translation - 2 (01:00:32)
IBM Model 1: The first translation attention model! (01:00:46)
Models of translation - 1 (01:02:27)
Models of translation - 2 (01:02:45)
Models of translation - 3 (01:02:47)
Models of translation - 4 (01:02:53)
Models of translation - 5 (01:02:56)
Models of translation - 6 (01:03:21)
Encoder-Decoders - 1 (01:03:54)
Encoder-Decoders - 2 (01:05:00)
Encoder-Decoders: A naive additive model - 1 (01:06:13)
Encoder-Decoders: A naive additive model - 2 (01:06:52)
Encoder-Decoders: A naive additive model - 3 (01:06:53)
Encoder-Decoders: A naive additive model - 4 (01:06:55)
Encoder-Decoders: A naive additive model - 5 (01:07:11)
Encoder-Decoders: A naive additive model - 6 (01:07:28)
Encoder-Decoders: A naive additive model - 7 (01:07:31)
Encoder-Decoders: A naive additive model - 8 (01:07:33)
Encoder-Decoders: A naive additive model - 9 (01:07:35)
Encoder-Decoders: A naive additive model - 10 (01:07:42)
Encoder-Decoders: A naive additive model - 11 (01:07:49)
Encoder-Decoders: A naive additive model - 12 (01:07:56)
Encoder-Decoders: A naive additive model - 13 (01:08:05)
Recurrent Encoder-Decoders for MT - 1 (01:10:05)
Recurrent Encoder-Decoders for MT - 2 (01:11:49)
Recurrent Encoder-Decoders for MT - 3 (01:12:33)
Attention Models for MT - 1 (01:13:46)
Attention Models for MT - 2 (01:14:41)
Attention Models for MT - 3 (01:16:34)
Attention Models for MT - 4 (01:16:44)
Attention Models for MT - 5 (01:16:48)
Attention Models for MT - 6 (01:16:57)
Montreal WMT Bleu Scores (01:18:40)
Issues, advantages and the future of MT (01:25:22)