Using LR-based discriminant kernel methods with applications to speaker verification
Related papers
Lecture Notes in Computer Science, 2006
In a log-likelihood ratio (LLR)-based speaker verification system, the alternative hypothesis is usually ill-defined and hard to characterize a priori, since it should cover the space of all possible impostors. In this paper, we propose a new LLR measure in an attempt to characterize the alternative hypothesis in a more effective and robust way than conventional methods. This LLR measure can be further formulated as a non-linear discriminant classifier and solved by kernel-based techniques, such as the Kernel Fisher Discriminant (KFD) and Support Vector Machine (SVM). The results of experiments on two speaker verification tasks show that the proposed methods outperform classical LLR-based approaches.
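For orientation, the conventional LLR test that this abstract takes as its starting point can be written as follows. The symbols are assumptions introduced here for illustration, not taken from the abstract: $U$ is the test utterance, $\lambda$ the target speaker model, $\bar{\lambda}$ a model of the alternative (impostor) hypothesis, and $\theta$ the decision threshold.

```latex
L(U) = \log p(U \mid \lambda) - \log p(U \mid \bar{\lambda}),
\qquad
\text{accept the claimed speaker if } L(U) \ge \theta, \text{ otherwise reject.}
```

The difficulty the abstract refers to lies in $p(U \mid \bar{\lambda})$, which has to stand in for every possible impostor.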
IEEE Transactions on Audio, Speech, and Language Processing, 2000
Speaker verification can be viewed as a task of modeling and testing two hypotheses: the null hypothesis and the alternative hypothesis. Since the alternative hypothesis involves unknown impostors, it is usually hard to characterize a priori. In this paper, we propose improving the characterization of the alternative hypothesis by designing two decision functions based, respectively, on a weighted arithmetic combination and a weighted geometric combination of discriminative information derived from a set of pre-trained background models. The parameters associated with the combinations are then optimized using two kernel discriminant analysis techniques, namely, the Kernel Fisher Discriminant (KFD) and Support Vector Machine (SVM). The proposed approaches have two advantages over existing methods. The first is that they embed a trainable mechanism in the decision functions. The second is that they convert variable-length utterances into fixed-dimension characteristic vectors, which are easily processed by kernel discriminant analysis. The results of speaker-verification experiments conducted on two speech corpora show that the proposed methods outperform conventional likelihood ratio-based approaches.
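The two combinations described above can be sketched roughly as follows. This is one plausible reading of the abstract, not the authors' exact formulation: the function names, the use of per-utterance log-likelihoods as inputs, and the example weights are all assumptions made for illustration. In the paper the weights are learned with KFD or SVM; here they are simply fixed.

```python
import numpy as np

def geometric_llr(target_ll, background_ll, weights):
    """Weighted geometric combination (hypothetical helper).

    Returns sum_i w_i * (log p(U|target) - log p(U|bg_i)) together with the
    fixed-dimension characteristic vector x, one entry per background model.
    """
    x = target_ll - np.asarray(background_ll, dtype=float)
    return float(np.dot(weights, x)), x

def arithmetic_llr(target_ll, background_ll, weights):
    """Weighted arithmetic combination (hypothetical helper).

    Returns log p(U|target) - log sum_i w_i * p(U|bg_i), evaluated in the
    log domain (log-sum-exp) to avoid underflow.
    """
    z = np.log(np.asarray(weights, dtype=float)) + np.asarray(background_ll, dtype=float)
    m = z.max()
    return float(target_ll - (m + np.log(np.exp(z - m).sum())))

# Illustrative values only: one target score and three background-model scores.
target_ll = -10.0
background_ll = [-11.0, -13.0, -15.0]
weights = [1 / 3, 1 / 3, 1 / 3]

geo_score, char_vector = geometric_llr(target_ll, background_ll, weights)
ari_score = arithmetic_llr(target_ll, background_ll, weights)
```

Whatever the utterance length, the characteristic vector has one entry per background model, which is what makes it convenient input for a kernel classifier. With weights that sum to one, the arithmetic score never exceeds the geometric one, since a weighted arithmetic mean of likelihoods is at least their weighted geometric mean.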
Evaluation of kernel methods for speaker verification and identification
2002
Support vector machines are evaluated on speaker verification and speaker identification tasks. We compare the polynomial kernel, the Fisher kernel, a likelihood ratio kernel and the pair hidden Markov model kernel with baseline systems based on a discriminative polynomial classifier and generative Gaussian mixture model classifiers. Simulations were carried out on the YOHO database and some promising results were obtained.
Optimization of discriminative kernels in SVM speaker verification
2009
An important aspect of SVM-based speaker verification systems is the design of sequence kernels. These kernels should be able to map variable-length observation sequences to fixed-size supervectors that capture the dynamic characteristics of speech utterances and allow speakers to be easily distinguished. Most existing kernels in SVM speaker verification are obtained by assuming a specific form for the similarity function of supervectors. This paper relaxes this assumption to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate a target speaker from impostors using either regression analysis or SVM training. The idea was applied to both low- and high-level speaker verification. In both cases, results show that the proposed kernels outperform the state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.
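As a rough illustration of a kernel built as a linear combination of base kernels, the sketch below mixes a linear and an RBF kernel over toy supervectors. Everything specific here is an assumption: the choice of base kernels, the random data, and the fixed weights (the paper obtains the weights by regression or SVM training). Restricting the weights to be non-negative is also an assumption, made so that the combined Gram matrix is guaranteed to stay positive semidefinite.

```python
import numpy as np

def linear_kernel(X, Y):
    """Gram matrix of inner products between rows of X and rows of Y."""
    return X @ Y.T

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF Gram matrix exp(-gamma * ||x - y||^2)."""
    sq_dist = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def combined_kernel(X, Y, weights):
    """Weighted sum of the base kernels; weights are assumed non-negative."""
    base = (linear_kernel(X, Y), rbf_kernel(X, Y))
    return sum(w * K for w, K in zip(weights, base))

# Toy stand-ins for fixed-size supervectors (6 utterances, 4 dimensions).
rng = np.random.default_rng(0)
supervectors = rng.normal(size=(6, 4))

# Hypothetical fixed weights; in the paper these would be learned.
K = combined_kernel(supervectors, supervectors, weights=(0.3, 0.7))
min_eigenvalue = np.linalg.eigvalsh(K).min()
```

A Gram matrix computed this way can be passed to any SVM implementation that accepts precomputed kernels.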
18th International Conference on Pattern Recognition (ICPR'06), 2006
Real-world applications often involve a binary hypothesis testing problem in which one of the two hypotheses is ill-defined and hard to characterize precisely by a single measure. In this paper, we develop a framework that integrates multiple hypothesis testing measures into a unified decision basis, and apply kernel-based classification techniques, namely, Kernel Fisher Discriminant (KFD) and Support Vector Machine (SVM), to optimize the integration. Experiments conducted on speaker verification demonstrate the superiority of our approaches over the predominant approaches.
Characterizing speech utterances for speaker verification with sequence kernel SVM
2008
A support vector machine (SVM) equipped with a sequence kernel has proven to be a powerful technique for speaker verification. A number of sequence kernels have recently been proposed, each motivated from a different perspective and with its own mathematical derivation, which makes analytical comparison of kernels difficult. To facilitate such comparisons, we propose a generic structure showing how different levels of cues conveyed by speech utterances, ranging from low-level acoustic features to high-level speaker cues, are characterized within a sequence kernel. We then identify the similarities and differences between the popular generalized linear discriminant sequence (GLDS) and GMM supervector kernels, as well as our own probabilistic sequence kernel (PSK). Furthermore, we enhance the PSK in terms of accuracy and computational complexity. The enhanced PSK gives accuracy competitive with the other two kernels. Fusing all three kernels yields an EER of 4.83% on the 2006 NIST SRE core test.
Kernel multimodal discriminant analysis for speaker verification
2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
In this paper, we propose a robust speaker feature extraction method using kernel multimodal Fisher discriminant analysis (kernel MFDA). Kernel MFDA is designed to have the characteristics of both kernel principal component analysis (kernel PCA) and kernel Fisher discriminant analysis (kernel FDA). Therefore, the feature vectors extracted by kernel MFDA are both denoised and discriminative. For evaluation, we compare the proposed method with principal component analysis (PCA) and kernel PCA in speaker verification systems.
Exploring kernel discriminant analysis for speaker verification with limited test data
Pattern Recognition Letters, 2017
Highlights
• A novel framework for channel/session compensation in i-vector speaker modeling.
• Explore non-linearity in channel/session information at i-vector framework.
• Effectiveness of kernel discriminant analysis (KDA) with higher dimension.
• Significance of KDA for speaker verification with limited test data.