The Robustness of GMM-SVM in Real World Applied to Speaker Verification

Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker verification. The standard training method for GMMs is MAP adaptation of the means of the mixture components using speech from a target speaker. In this work we examine two modeling approaches, GMM-UBM and GMM-SVM, and their application to speaker verification. Feature vectors composed of Mel Frequency Cepstral Coefficients (MFCC) extracted from the speech signal are used to train the Gaussian mixture model (GMM), and the mean vectors obtained from the GMM-UBM are used to train the SVM. Cepstral mean subtraction (CMS) is applied to the MFCCs to center the data around their mean. Both the GMM-UBM and GMM-SVM systems use a 2048-mixture UBM. The verification phase was tested on the Aurora database at different signal-to-noise ratios (SNR) and under three noisy conditions. The experimental results showed that GMM-SVM outperforms GMM-UBM in speaker verification espe...
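The front-end and adaptation steps named in the abstract (CMS on the MFCCs, MAP adaptation of the UBM means, and stacking the adapted means into a vector for the SVM) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the diagonal-covariance assumption, and the relevance factor of 16 are assumptions made for illustration only; the abstract does not state them. A real system would use a 2048-mixture UBM rather than the small sizes suggested by the test data here.

```python
import numpy as np


def cepstral_mean_subtraction(mfcc):
    """Subtract the per-coefficient mean over all frames (rows = frames)."""
    return mfcc - mfcc.mean(axis=0, keepdims=True)


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt only the means of a diagonal-covariance GMM (the UBM).

    frames:    (T, D) feature vectors from the target speaker
    weights:   (K,)   UBM mixture weights
    means:     (K, D) UBM means
    variances: (K, D) UBM diagonal variances
    relevance: assumed relevance factor (hypothetical default, not from the source)
    """
    # Log-density of every frame under every component: shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )

    # Component posteriors per frame, computed stably in the log domain.
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    n_k = post.sum(axis=0)                            # (K,)
    e_k = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]  # (K, D)

    # Data-dependent interpolation between speaker data and UBM prior.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * e_k + (1.0 - alpha) * means


def mean_supervector(adapted_means):
    """Stack the adapted means into one vector, usable as an SVM input."""
    return adapted_means.ravel()
```

Components that see little speaker data keep means close to the UBM, because their interpolation weight `alpha` stays near zero; components with many assigned frames move toward the speaker's data.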