Connectionist Temporal Classification for End-to-End Speech Recognition

author: Florian Metze, Language Technologies Institute, Carnegie Mellon University
published: July 31, 2016,   recorded: July 2016,   views: 1958
Categories

Slides

Related Open Educational Resources

Related content

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Lecture popularity: You need to login to cast your vote.
  Bibliography

Description

The performance of automatic speech recognition (ASR) has improved tremendously due to the application of deep neural networks (DNNs). Despite this progress, building a new ASR system remains a challenging task, requiring various resources, multiple training stages and significant expertise. In this talk, I will present an approach that drastically simplifies building acoustic models for the existing weighted finite state transducer (WFST) based decoding approach, and lends itself to end-to-end speech recognition, allowing optimization for arbitrary criteria. Acoustic modeling now involves learning a single recurrent neural network (RNN), which predicts context-independent targets (e.g., syllables, phonemes or characters). The connectionist temporal classification (CTC) objective function marginalizes over all possible alignments between speech frames and label sequences, removing the need for a separate alignment of the training data. We present a generalized decoding approach based on weighted finite-state transducers (WFSTs), which enables the efficient incorporation of lexicons and language models into CTC decoding. Experiments show that this approach achieves state-of-the-art word error rates, while drastically reducing complexity and speeding up decoding when compared to standard hybrid DNN systems.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: