Support vector machines and Joint Factor Analysis for speaker verification
Related papers
Support vector machines for speaker verification and identification
2000
In this paper the performance of the support vector machine (SVM) on a speaker verification task is assessed. Since speaker verification requires binary decisions, support vector machines seem to be a promising candidate to perform the task. A new technique for normalising the polynomial kernel is developed and used to achieve performance comparable to other classifiers on the YOHO database. We also present results on a speaker identification task.
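The abstract above does not say how the polynomial kernel is normalised. As a rough illustration only, the sketch below applies generic cosine normalisation, K(x, y) / sqrt(K(x, x) K(y, y)), which is an assumption and not necessarily the technique developed in the paper. The function names, the degree, and the offset `coef0` are hypothetical.

```python
import math


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Plain polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def normalized_poly_kernel(x, y, degree=2, coef0=1.0):
    """Cosine-normalised kernel (assumed form, not taken from the paper).

    With coef0 > 0 the self-kernels K(x, x) and K(y, y) are strictly
    positive, so the square root and the division are well defined.
    """
    kxy = poly_kernel(x, y, degree, coef0)
    kxx = poly_kernel(x, x, degree, coef0)
    kyy = poly_kernel(y, y, degree, coef0)
    return kxy / math.sqrt(kxx * kyy)
```

Normalising this way makes every vector have unit self-similarity, which keeps kernel values in a bounded range regardless of the magnitude of the input features.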
A straightforward and efficient implementation of the factor analysis model for speaker verification
2007
For a few years, the problem of session variability in text-independent automatic speaker verification has been actively tackled. A new paradigm based on a factor analysis model has been successfully applied to this task. While very efficient, its implementation is demanding. In this paper, the algorithms involved in the eigenchannel MAP model are written down for a straightforward implementation, without referring to previous work or complex mathematics. In addition, a different compensation scheme is proposed where the standard GMM likelihood can be used without any modification to obtain good performance (even without the need for score normalization). The use of the compensated supervectors within an SVM classifier through a distance-based kernel is also investigated. Experimental results show an overall 50% relative gain over the standard GMM-UBM system on the NIST SRE 2005 and 2006 protocols (both at the DCFmin and EER). Index Terms: Speaker Verification, Session ...
Text-independent speaker verification using support vector machines
2001: A Speaker …, 2001
In this article we address the issue of using the Support Vector Learning technique in combination with the currently well-performing Gaussian Mixture Models (GMM) for speaker verification experiments. The Support Vector Machine (SVM) is a new and very promising technique in statistical learning theory. Recently this technique produced very interesting results in image processing [1] [2] [3], and for the fusion of experts in biometric authentication.
SVMSVM: support vector machine speaker verification methodology
2003
Support vector machines with the Fisher and score-space kernels are used for text-independent speaker verification to provide direct discrimination between complete utterances. This is unlike approaches such as discriminatively trained Gaussian mixture models or other discriminative classifiers that discriminate at the frame level only. Using the sequence-level discrimination approach we are able to achieve error rates that are significantly better than the current state-of-the-art on the PolyVar database.
Support vector GMMs for speaker verification
Speaker and Language Recognition …, 2006
This article presents a new approach using the discrimination power of Support Vector Machines (SVM) in combination with Gaussian Mixture Models (GMM) for Automatic Speaker Verification (ASV). In this combination SVMs are applied in the GMM model space. Each point of this space represents a GMM speaker model. The kernel used for the SVM allows the computation of a similarity between GMM models. It is calculated using the Kullback-Leibler (KL) divergence. The results of this new approach show a clear improvement compared to a simple GMM system on the NIST 2005 Speaker Recognition Evaluation primary task.
Speaker Verification Using Accumulative Vectors with Support Vector Machines
Lecture Notes in Computer Science, 2013
The applications of Support Vector Machines (SVM) in speaker recognition are mainly related to the supervector paradigm based on Gaussian mixtures and the Universal Background Model. Recently, a new approach has been proposed that allows each acoustic frame to be represented in a binary discriminant space. A representation of a speaker obtained from the binary space, called accumulative vectors, has also been proposed. In this article we show results obtained using SVM with the accumulative vectors and Nuisance Attribute Projection (NAP) as a method for compensating for session variability. We also introduce a new method to counteract the effects of signal length on the construction of the accumulative vectors, to improve the performance of the SVM.
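Nuisance Attribute Projection is usually described as removing a low-dimensional "nuisance" subspace from each vector with the projection P = I - U U^T, where the columns of U are orthonormal. The sketch below shows only that projection step. The function name and inputs are hypothetical, and how the nuisance directions are estimated is not covered by the abstract and is left out.

```python
def nap_project(x, nuisance_dirs):
    """Apply P = I - U U^T to vector x.

    nuisance_dirs is a list of direction vectors that are assumed to be
    orthonormal (unit length and mutually orthogonal). Under that
    assumption each coefficient can be computed from the original x.
    """
    out = list(x)
    for u in nuisance_dirs:
        # Component of x along this nuisance direction.
        coeff = sum(ui * xi for ui, xi in zip(u, x))
        # Subtract that component from the running result.
        out = [oi - coeff * ui for oi, ui in zip(out, u)]
    return out
```

After projection, the vector has no component along any of the supplied directions, so an SVM trained on the projected vectors cannot use that variability.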
Speaker Verification Using Support Vector Machines and High-Level Features
IEEE Transactions on Audio, Speech and Language Processing, 2007
High-level characteristics such as word usage, pronunciation, phonotactics, prosody, etc., have seen a resurgence for automatic speaker recognition over the last several years. With the availability of many conversation sides per speaker in current corpora, high-level systems now have the amount of data needed to sufficiently characterize a speaker. Although a significant amount of work has been done in finding novel high-level features, less work has been done on modeling these features. We describe a method of speaker modeling based upon support vector machines. Current high-level feature extraction produces sequences or lattices of tokens for a given conversation side. These sequences can be converted to counts and then frequencies of n-grams for a given conversation side. We use support vector machine modeling of these n-gram frequencies for speaker verification. We derive a new kernel based upon linearizing a log likelihood ratio scoring system. Generalizations of this method are shown to produce excellent results on a variety of high-level features. We demonstrate that our methods produce results significantly better than standard log-likelihood ratio modeling. We also demonstrate that our system can perform well in conjunction with standard cepstral speaker recognition systems.
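The step from token sequences to n-gram frequencies, followed by a kernel between two conversation sides, can be sketched as below. The abstract does not give the kernel's form. The version here weights each shared n-gram by the inverse of a background frequency, which is one form a linearized log likelihood ratio can take and is an assumption. All names and the toy background table are hypothetical.

```python
from collections import Counter


def ngram_freqs(tokens, n=2):
    """Convert a token sequence into relative n-gram frequencies."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def background_scaled_kernel(freqs_a, freqs_b, background):
    """Sum of f_a(g) * f_b(g) / background(g) over shared n-grams.

    Assumed kernel form: rare n-grams (small background frequency)
    contribute more than common ones. N-grams missing from either side
    or from the background table are skipped.
    """
    return sum(
        freqs_a[g] * freqs_b[g] / background[g]
        for g in freqs_a
        if g in freqs_b and g in background
    )
```

Dividing by the background frequency is equivalent to scaling each feature by one over its square root and then using a plain linear kernel, so the result can be fed to any linear SVM.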
Combining GMM's with Support Vector Machines for Text-independent Speaker Verification
… Conference on Speech …, 2001
Current best-performing speaker recognition algorithms are based on Gaussian Mixture Models (GMM). Their results are not satisfactory for all experimental conditions, especially for the mismatched (train/test) conditions. The Support Vector Machine is a new and very promising technique in statistical learning theory. Recently, this technique produced very interesting results in image processing [2], [3], [4] and for the fusion of experts in biometric authentication. In this paper we address the issue of using the Support Vector Learning technique in combination with the currently well-performing GMM models, in order to improve speaker verification results.
Factored covariance modeling for text-independent speaker verification
International Conference on Acoustics, Speech, and Signal Processing, 2011
Gaussian mixture models (GMMs) are commonly used to model the spectral distribution of speech signals for text-independent speaker verification. Mean vectors of the GMM, used in conjunction with a support vector machine (SVM), have been shown to be effective in characterizing speaker information. In addition to the mean vectors, covariance matrices capture the correlation between spectral features, which also represents some salient information about speaker identity. This paper investigates the use of local correlation between different dimensions of the acoustic vector by using factor analysis and a linear Gaussian model. A log-Euclidean inner product kernel is used to measure the similarity between two speech utterances in the form of covariance matrices. Experiments carried out on the NIST 2006 speaker verification tasks show promising results.
Speaker verification using sequence discriminant support vector machines
IEEE Transactions on Speech and Audio Processing, 2005
This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality, since the dimensionality is related to the number of parameters in the underlying generative model. To address problems that arise in the resultant optimization we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here reduces the relative error rates by 34% compared to a GMM likelihood ratio system. Index Terms: Fisher kernel, score-space kernel, speaker verification, support vector machine. Vincent Wan received a BA in