Multi-stream modeling with applications in speech and multimodal processing

author: Herve Bourlard, IDIAP Research Institute
published: Feb. 25, 2007, recorded: September 2004

After a brief discussion of the problems that arise when processing and modeling multiple-stream (multi-channel, multi-sensor) signals, we will discuss a few statistical structures, such as the multi-stream HMM and the asynchronous HMM, that can accommodate multiple asynchronous observation streams, possibly with different frame rates. It will be shown on several speech recognition and multimodal fusion tasks that it can be beneficial to "desynchronize" the streams in order to maximize their joint likelihood. Applications in speech recognition, such as multi-band and multi-stream speech processing, will be discussed. Finally, multimodal applications that benefit significantly from this multi-stream paradigm will also be discussed, including audio-visual speech recognition and the modeling of human interaction in meetings (by modeling the joint behaviours of participants through multiple audio and visual features).
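
To make the idea of a joint likelihood over streams concrete, here is a minimal sketch of one common way to score a single HMM state in the synchronous multi-stream case: each stream's log-likelihood is computed separately and the results are combined as a weighted sum. This is an illustration under stated assumptions, not necessarily the formulation used in the lecture. The function names, the scalar Gaussian emission model, and the weight values are all hypothetical, and the asynchronous case (streams allowed to drift apart in time) is not covered.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian (assumed emission model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def multistream_state_loglik(observations, stream_params, weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state.

    observations  -- one scalar per stream (real features would be vectors)
    stream_params -- one (mean, variance) pair per stream
    weights       -- one exponent per stream; a weight of 0 ignores a stream
    """
    if not (len(observations) == len(stream_params) == len(weights)):
        raise ValueError("one observation, parameter pair and weight per stream")
    return sum(
        w * gaussian_logpdf(o, m, v)
        for o, (m, v), w in zip(observations, stream_params, weights)
    )


# Hypothetical two-stream frame, e.g. one audio and one visual feature.
frame = [0.3, 1.2]
state_params = [(0.0, 1.0), (1.0, 2.0)]  # illustrative (mean, variance) pairs
stream_weights = [0.7, 0.3]              # illustrative reliability weights

score = multistream_state_loglik(frame, state_params, stream_weights)
print(f"combined state log-likelihood: {score:.4f}")
```

The weights let a less reliable stream (for example, audio in heavy noise) contribute less to the combined score. Handling streams with different frame rates or allowing them to desynchronize requires a richer model than this per-frame combination.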
