JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms

Tandem Connectionist Feature Extraction for Conversational Speech Recognition

author: Qifeng Zhu, ICSI

Description

Multi-layer perceptrons (MLPs) can be used in automatic speech recognition in many ways. One application that has received particular attention over the last few years is the Tandem approach, described by Hermansky et al. in a number of publications. Here we discuss the characteristics of the MLP-based features used in the Tandem approach and conclude with a report on their application to conversational speech recognition. The paper shows that MLP transformations yield variables with regular distributions, which can be further modified by taking the logarithm so that the distribution is easier to model with a Gaussian-HMM. Two or more vectors of these features can easily be combined without increasing the feature dimension. We also report recognition results showing that MLP features can significantly improve recognition performance on the NIST 2001 Hub-5 evaluation set with models trained on the Switchboard corpus, even for complex systems incorporating MMIE training and other enhancements.
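The two properties mentioned in the description, the effect of the logarithm and combination without a dimension increase, can be illustrated with a small sketch. It is not the system from the talk: the "MLP outputs" are synthetic softmax posteriors drawn from random activations, the combination rule (a plain frame-by-frame average) is an assumption because the description does not name one, and all function names are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def skewness(x):
    """Sample skewness: third central moment over the cubed standard deviation."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    return (c ** 3).mean() / (c ** 2).mean() ** 1.5


def log_posteriors(p, floor=1e-10):
    """Log of the posteriors; the floor (an assumed value) guards against log(0)."""
    return np.log(np.maximum(p, floor))


def combine_posteriors(p_a, p_b):
    """Assumed combination rule: average two posterior streams frame by frame.

    The result has the same number of columns as each input, so the
    feature dimension does not grow.
    """
    return 0.5 * (p_a + p_b)


rng = np.random.default_rng(0)
n_frames, n_classes = 2000, 5

# Synthetic stand-ins for the pre-softmax activations of two MLPs,
# for frames that all belong to class 0 (hence the boost on column 0).
logits_a = rng.normal(size=(n_frames, n_classes))
logits_b = rng.normal(size=(n_frames, n_classes))
logits_a[:, 0] += 2.0
logits_b[:, 0] += 2.0

post_a = softmax(logits_a)
post_b = softmax(logits_b)

# Within-class distribution of one posterior dimension, before and after the log.
raw_skew = skewness(post_a[:, 1])
log_skew = skewness(log_posteriors(post_a)[:, 1])
print("skewness of posterior:    ", raw_skew)
print("skewness of log posterior:", log_skew)

# Combine the two streams, then take logs; the shape stays (n_frames, n_classes).
combined = combine_posteriors(post_a, post_b)
log_feats = log_posteriors(combined)
print("combined feature shape:", log_feats.shape)
```

On this synthetic data the raw posterior is strongly skewed and bounded in [0, 1], while its logarithm is closer to symmetric, which is the kind of shape a Gaussian models more easily. Averaging keeps each row a valid probability distribution, so the same log step applies to the combined stream unchanged.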

Slides
0:01 Tandem Connectionist Feature Extraction for Conversational Speech Recognition
0:23 Using Multi-Layer Perceptron (MLP) in Feature Extraction for Speech Recognition
1:41 MLP outputs as features to HMM
3:04 *1 Simple and Regular Within-Class Distribution
4:09 Exp. 1: Posterior Feature Space
4:55 Exp. 2: Log Posterior Feature Space
5:48 Exp. 3: Typical Distributions of Log Posteriors in Histogram
6:24 *2 Reducing Speaker Variation
8:09 Exp. 4: Variances of Speaker Adaptive Training (SAT) Transforms for Different Speakers
9:18 *3 Feature Combination: Better Performance, No Dimensionality Increase
10:23 Usually What to Expect for a Feature Transform
11:06 The Feature Generation Diagram
12:30 Some Practical Details in Feature Generation and HMM Decoding
13:31 Recognition Experiments
14:30 Recognition with a ‘Plain’ System with ML Training
15:29 Concerns for a Novel Feature: Scale and Carry Through
16:04 Results with Adaptation
16:49 Results in a Full-Fledged System
18:26 Summary

Reviews and comments:

Comment1 Joey, June 19, 2007 at 1:24 p.m.:

The audio failed on this lecture after approx. 3mins 27secs. Is this a general problem with this lecture or is it just an issue for me?

