Rational Kernels: A General Machine Learning Framework for the Analysis of Text, Speech and Biological Sequences
Kernel methods are widely used in statistical learning because of their excellent performance and their computational efficiency in high-dimensional feature spaces. However, text and speech data cannot always be represented by the fixed-length vectors that traditional kernels handle. In this talk, we introduce a general framework, rational kernels, that extends kernel techniques to variable-length sequences and, more generally, to large sets of weighted alternative sequences represented by weighted automata. Far from being abstract and computationally complex objects, rational kernels can be readily implemented using general weighted automata algorithms that have been used extensively in text and speech processing and that we will briefly review. Rational kernels provide a general framework for defining and designing similarity measures between word or phone lattices, which is particularly useful in speech mining applications. Viewed as similarity measures, they can also be used with Support Vector Machines, where they significantly improve spoken-dialog classification performance on difficult tasks such as the AT&T 'How May I Help You' (HMIHY) system. We present several examples of rational kernels to illustrate these applications. Finally, we show that many string kernels commonly considered in computational biology are specific instances of rational kernels.
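As an illustration that is not taken from the talk itself, the n-gram count kernel is one commonly cited example of a kernel over variable-length sequences: the similarity of two sequences is the sum, over all n-grams, of the product of their counts in each sequence. The sketch below is a minimal version under simplifying assumptions. A lattice is given as an explicit list of (sequence, probability) pairs rather than as a weighted automaton, and the function names are hypothetical. A weighted-automata implementation would avoid enumerating the alternatives, which this brute-force version does not attempt.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Count every contiguous n-gram in a sequence (a string or a list of tokens)."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def ngram_kernel(x, y, n=2):
    """n-gram kernel between two single sequences: sum of products of n-gram counts."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(c * cy[g] for g, c in cx.items())


def expected_ngram_counts(weighted_seqs, n):
    """Expected n-gram counts over (sequence, probability) pairs.

    Assumption: the lattice is small enough to list its paths explicitly.
    """
    total = Counter()
    for seq, prob in weighted_seqs:
        for gram, count in ngram_counts(seq, n).items():
            total[gram] += prob * count
    return total


def lattice_ngram_kernel(a, b, n=2):
    """n-gram kernel between two weighted sets of alternative sequences."""
    ca, cb = expected_ngram_counts(a, n), expected_ngram_counts(b, n)
    return sum(w * cb[g] for g, w in ca.items())


if __name__ == "__main__":
    # Two competing hypotheses for one utterance, one hypothesis for another.
    lattice_a = [("abab", 0.5), ("bab", 0.5)]
    lattice_b = [("ab", 1.0)]
    print(ngram_kernel("abab", "bab"))
    print(lattice_ngram_kernel(lattice_a, lattice_b))
```

Because the kernel is an inner product of (expected) count vectors, it is symmetric and positive semi-definite, so its Gram matrix can be passed to an SVM implementation that accepts precomputed kernels.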