Modelling Intra-Speaker Variability for Improved Speaker Recognition
Description
In this paper we present a speaker recognition algorithm that explicitly models intra-speaker, inter-session variability. Such variability, which is caused by channel, noise, and temporary speaker characteristics (mood, fatigue, etc.), is not modeled explicitly by state-of-the-art speaker recognition algorithms. We define a session space in which each session (either a training or a test utterance) is a vector. We then calculate a rotation of the session space under which the estimated intra-speaker subspace is trivially isolated and can be modeled explicitly. Because of the high dimensionality of the session space, it is not feasible to use standard orthogonalization methods. We therefore use QR factorization based on Givens rotations to calculate the projection. On the NIST-2004 evaluation corpus, the recognition error rate was reduced by 23% compared to the classic GMM state-of-the-art algorithm.
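The Givens-based QR step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `givens_qr`, the toy dimensions, and the use of NumPy are assumptions made for illustration, and the dense matrix `Q` built here would not be formed explicitly at the dimensionality the abstract describes. The sketch also assumes that the estimated intra-speaker subspace is given as the columns of a matrix `V`, so that rotating a session vector by the transpose of `Q` places its intra-speaker component in the first few coordinates.

```python
import numpy as np

def givens_qr(A):
    """QR factorization of an m x n matrix (m >= n) using Givens rotations.

    Returns Q (m x m, orthogonal) and R (m x n, upper triangular) with A = Q @ R.
    """
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    for j in range(n):
        # Zero the entries below the diagonal in column j, from the bottom up.
        for i in range(m - 1, j, -1):
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue  # both entries are already zero
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])  # rotation acting on rows i-1 and i
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T
    return Q, R

# Toy example (hypothetical sizes): a 2-dimensional subspace of a
# 6-dimensional session space, spanned by the columns of V.
rng = np.random.default_rng(0)
V = rng.standard_normal((6, 2))
Q, R = givens_qr(V)

# Rotate a session vector; under the assumption above, the first two
# coordinates of the result hold its component in the subspace spanned by V.
x = rng.standard_normal(6)
x_rotated = Q.T @ x
```

Each rotation touches only two rows at a time, which is one reason Givens rotations are a natural choice when the full space is too large for a dense orthogonalization.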


