Cocktail Party Problem as Binary Classification

author: DeLiang Wang, Department of Computer Science and Engineering, Ohio State University
published: July 30, 2009, recorded: June 2009, views: 6633




Speech segregation, or the cocktail party problem, has proven to be extremely challenging. Part of the challenge stems from the lack of a carefully analyzed computational goal. While the separation of every sound source in a mixture is considered the gold standard, I argue that such an objective is neither realistic nor what the human auditory system does. Motivated by the auditory masking phenomenon, we have suggested instead the ideal time-frequency (T-F) binary mask as a main goal for computational auditory scene analysis. Ideal binary masking retains the mixture energy in T-F units where the local signal-to-noise ratio exceeds a certain threshold, and rejects the mixture energy in other T-F units. Recent psychophysical evidence shows that ideal binary masking leads to large speech intelligibility improvements in noisy environments for both normal-hearing and hearing-impaired listeners. The effectiveness of the ideal binary mask implies that sound separation may be formulated as a case of binary classification, which opens the cocktail party problem to a variety of pattern classification and clustering methods. As an example, I discuss a recent system that segregates unvoiced speech by supervised classification of acoustic-phonetic features.
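The masking rule described in the abstract can be illustrated with a small sketch. It assumes the premixed target and noise energies are already available as time-frequency arrays (for example, from a filterbank or short-time Fourier analysis, neither of which is shown here). The function names, the `lc_db` threshold parameter, and the toy values are hypothetical choices for illustration, not taken from the lecture.

```python
import numpy as np

def ideal_binary_mask(target_energy, noise_energy, lc_db=0.0, eps=1e-12):
    """Label each T-F unit 1 if its local SNR (in dB) exceeds lc_db, else 0.

    target_energy, noise_energy: arrays of per-unit energies with the same shape.
    lc_db: local SNR threshold in dB (hypothetical parameter name).
    eps: small constant that avoids division by zero and log of zero.
    """
    target_energy = np.asarray(target_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    local_snr_db = 10.0 * np.log10((target_energy + eps) / (noise_energy + eps))
    return (local_snr_db > lc_db).astype(int)

def apply_mask(mixture_energy, mask):
    """Retain mixture energy where mask is 1 and reject it where mask is 0."""
    return np.asarray(mixture_energy, dtype=float) * mask

# Toy 2x2 grid of T-F units (rows: frequency channels, columns: time frames).
target = np.array([[4.0, 0.1],
                   [1.0, 9.0]])
noise = np.array([[1.0, 1.0],
                  [2.0, 3.0]])

# Assumption: target and noise are uncorrelated, so their energies add.
mixture = target + noise

mask = ideal_binary_mask(target, noise, lc_db=0.0)
masked = apply_mask(mixture, mask)
```

Because each unit receives a label of 0 or 1, estimating the mask from the mixture alone amounts to a per-unit binary classification problem, which is the formulation the abstract argues for. The "ideal" mask above needs the premixed signals, so in practice it would serve as the training target rather than as something computable at test time.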

See Also:

Download slides: mlss09us_wang_cppbc_01.ppt (4.2 MB)
