Machine Learning in Acoustic Signal Processing
published: July 30, 2009, recorded: June 2009, views: 14019
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
This tutorial presents a framework for understanding and comparing applications of pattern recognition in acoustic signal processing. Representative applications will be delimited by two binary features: (1) regression vs. (2) classification (inferred variables are continuous vs. discrete), (A) instantaneous vs. (B) dynamic. (1. Regression) problems include imaging and sound source tracking using a device with unknown properties, and inverse problems, e.g., articulatory estimation from speech audio. (2. Classification) problems include, e.g., the detection of syllable onsets and offsets in a speech signal, and the classification of non-speech audio events. (A. Instantaneous) inference is performed using a universal approximator (neural network, Gaussian mixture, kernel regression), constrained or regularized, if necessary, to reduce generalization error (resulting in a support vector machine, shrunk net, pruned tree, or boosted classifier combination). (B. Dynamic) inference methods apply prior knowledge of state transition probabilities, either in the form of a regularization term (e.g., using Bayesian inference) or in the form of set constraints (e.g., using linear programming) or both; examples include speech-to-text transcription, acoustic-to-articulatory inversion using a switching Kalman filter, and computation of the query presence probability in an audio information retrieval task.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !