A Shortcut into Speaker Verification
Related papers
Iraqi Journal of Science
The theory and applications of speaker identification, recognition, and verification are well-established fields, yet publications and advances in the relevant products are still emerging. In this paper, research publications of the past 25 years (1996 to 2020) were studied and analysed, with a focus on speaker identification, speaker recognition, and speaker verification. The study was carried out using the Science Direct databases. References of several types, including review articles, research articles, encyclopaedia entries, book chapters, and conference abstracts, were categorized and investigated. A summary of this literature is presented, together with statistical analyses of the publications and their categories over that period. Important information, including the dataset used, the size of the data, the implemented methods, and the accuracy of the results reported in the analysed research, is extracted...
Speaker Verification and Identification
Intelligent Applications
A speaker recognition system verifies or identifies a speaker's identity based on their voice. Voice is considered one of the most convenient biometric characteristics for human-machine communication. This chapter introduces several speaker recognition systems and examines their performance under various conditions. Speaker recognition can be classified as either speaker verification or speaker identification. Speaker verification aims to verify whether an input utterance corresponds to a claimed identity, and speaker identification aims to identify an input utterance by selecting one model from a set of enrolled speaker models. Both speaker verification and speaker identification systems consist of three essential elements: feature extraction, speaker modeling, and matching. Feature extraction pertains to extracting the features of the input speech that are essential for speaker recognition. Speaker modeling pertains to probabilistically modeling the features of the enrolled speakers. The matc...
A Tutorial on Text-Independent Speaker Verification
Eurasip Journal on Advances in Signal Processing, 2004
This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the speech parameterization most commonly used in speaker verification, namely cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step in dealing with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications relating to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. The paper concludes by giving a few research trends in speaker verification for the next couple of years.
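As a rough illustration of the score normalization and DET evaluation steps mentioned in this abstract, the sketch below applies zero normalization (one common normalization, chosen here as an assumption) and computes the false rejection and false acceptance rates that a DET curve plots against each other. The function names and toy scores are invented for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def z_norm(raw_score, impostor_scores):
    """Centre and scale a raw score using impostor score statistics."""
    mu = mean(impostor_scores)
    sigma = pstdev(impostor_scores)
    return (raw_score - mu) / sigma


def error_rates(target_scores, impostor_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at one threshold."""
    false_rejections = sum(1 for s in target_scores if s < threshold)
    false_acceptances = sum(1 for s in impostor_scores if s >= threshold)
    return (false_rejections / len(target_scores),
            false_acceptances / len(impostor_scores))


def det_points(target_scores, impostor_scores):
    """Sweep the threshold over every observed score.

    Each tuple (threshold, false_rejection_rate, false_acceptance_rate)
    is one operating point; a DET curve plots the two rates against
    each other, usually on normal-deviate axes.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    return [(t, *error_rates(target_scores, impostor_scores, t))
            for t in thresholds]


# Toy scores, invented for illustration only.
targets = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.0, 3.0]

for point in det_points(targets, impostors):
    print(point)
```

Raising the threshold trades false acceptances for false rejections, which is the trade-off the DET curve makes visible.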
Discriminative training of minimum cost speaker verification systems
1998
This paper presents a new training method for speaker verification systems. The method improves on previous work in speaker verification by (1) developing a new a posteriori discriminative training algorithm, and (2) extending the algorithm to directly optimize speaker verification performance. The key element of this new training algorithm, which improves on the state of the art, is that it initializes the system with a Bayesian-adapted Gaussian mixture model. The discriminative training algorithm then adjusts the parameters of these models to directly minimize a verification cost function (VCF) representing the expected cost of false acceptances of impostors and false rejections of legitimate speakers. Results on the 1997 NIST speaker recognition evaluation corpus indicate that VCF performance can be improved, but at the expense of reduced performance in other parts of the system (different costs of false alarms and false rejections).
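To make the idea of an expected verification cost concrete, the sketch below weights the two error rates by their costs and by the prior probability of a genuine claim, which is the general form of a detection cost. The exact VCF used in the paper is not given in this abstract, so the formula, the cost values, the prior, and the toy scores are all assumptions for illustration.

```python
def expected_verification_cost(p_false_reject, p_false_accept,
                               cost_false_reject, cost_false_accept,
                               p_target):
    """Expected cost of one decision, assuming a prior p_target of a true claim."""
    return (cost_false_reject * p_false_reject * p_target
            + cost_false_accept * p_false_accept * (1.0 - p_target))


def min_cost_threshold(target_scores, impostor_scores,
                       cost_false_reject, cost_false_accept, p_target):
    """Return (threshold, cost) for the lowest-cost threshold among observed scores."""
    best = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        # A claim is accepted when its score is at or above the threshold.
        p_fr = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        cost = expected_verification_cost(
            p_fr, p_fa, cost_false_reject, cost_false_accept, p_target)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical scores, costs, and prior, chosen only to show the mechanics.
targets = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.0, 3.0]
print(min_cost_threshold(targets, impostors, 1.0, 1.0, 0.5))
```

Changing the two costs or the prior moves the lowest-cost threshold, which is why a system tuned for one cost setting can perform worse under another.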
Study of Speaker Verification Methods
Speaker verification is the process of accepting or rejecting the identity claim of a speaker by comparing a set of measurements of the speaker's utterances with a reference set of measurements of the utterances of the person whose identity is claimed. In speaker verification, a person makes an identity claim. There are two main stages in this technique: feature extraction and feature matching. Feature extraction is the process of extracting useful data that can later be used to represent the speaker. Feature matching involves identifying the unknown speaker by comparing the features extracted from the voice with the enrolled voices of known speakers.
Analysing the Performance of Speaker Verification Task using Different Features
International Journal of Computer Applications, 2013
Speaker recognition, also called "voice recognition", is the identification of the person who is speaking from the characteristics of their voice. Speaker recognition comprises speaker identification (SI) and speaker verification (SV). Speaker identification is the task of determining an unknown speaker's identity. If the speaker claims a certain identity and the voice is used to verify this claim, the task is called speaker verification: it determines whether an unknown voice matches the known voice of the speaker whose identity is being claimed. This paper addresses the speaker verification task, which has two phases: training and testing. In the training phase, different features, such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), and Perceptual Linear Prediction (PLP) coefficients, are extracted from the speech signal and used to train a Support Vector Machine to obtain the target speaker model. The model is trained with both genuine speaker and impostor utterances. In the testing phase, features are extracted from the test speech signal for different durations. The extracted feature vectors are given to the claimed speaker model, and the speaker is classified as either authorised or an impostor. The performance of the speaker verification task is analysed using different features and different utterance lengths. The results show that the performance of the speaker verification task decreases as the duration of the speech utterances decreases.
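The difference between the two tasks described in this abstract can be sketched in a few lines: identification picks the best-scoring enrolled speaker, while verification thresholds the score of one claimed speaker. The paper scores with a Support Vector Machine; cosine similarity on toy vectors is swapped in here only to keep the example self-contained, and the speaker names, vectors, and threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(test_vector, enrolled):
    """Speaker identification: return the enrolled speaker with the best score."""
    return max(enrolled,
               key=lambda name: cosine_similarity(test_vector, enrolled[name]))


def verify(test_vector, claimed_name, enrolled, threshold):
    """Speaker verification: accept or reject a single claimed identity."""
    score = cosine_similarity(test_vector, enrolled[claimed_name])
    return score >= threshold


# Hypothetical enrolled speakers; real systems would store models built
# from features such as MFCC rather than three hand-picked numbers.
enrolled = {"alice": [1.0, 0.0, 0.2], "bob": [0.0, 1.0, 0.1]}
test_vector = [0.9, 0.1, 0.2]

print(identify(test_vector, enrolled))
print(verify(test_vector, "alice", enrolled, threshold=0.8))
print(verify(test_vector, "bob", enrolled, threshold=0.8))
```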
Current speaker recognition applications involve the authentication of users by their voices for access to restricted information and privileges. The speech signal is often transmitted to the recognizer through communication channels presenting different transmission characteristics. The aim of this paper is to study the effects of speech bandwidth and coding schemes on speaker verification. We compared the performance of a Gaussian Mixture Model-Universal Background Model (GMM-UBM) classifier in two different conditions: in one condition, the system was trained and tested with speech processed using wideband codecs, and in the other with speech processed using narrowband codecs. Our results show that the verification task improves significantly when the system is trained and tested with speech transmitted through wideband channels.
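A GMM-UBM classifier of the kind named in this abstract typically scores a test utterance by the average log-likelihood ratio between the claimed speaker's mixture model and the universal background model. The sketch below shows that scoring step for one-dimensional features; the mixture parameters and frame values are hypothetical, and real systems use multi-dimensional cepstral features and many more mixture components.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one scalar feature under a 1-D Gaussian mixture."""
    density = 0.0
    for w, mu, var in zip(weights, means, variances):
        gaussian = math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        density += w * gaussian
    return math.log(density)


def average_llr(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of speaker model versus UBM."""
    total = sum(gmm_log_likelihood(x, *speaker_gmm) - gmm_log_likelihood(x, *ubm)
                for x in frames)
    return total / len(frames)


# Hypothetical models as (weights, means, variances); the speaker model
# differs from the background model in one component mean.
ubm = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
speaker = ([0.5, 0.5], [-1.0, 2.0], [1.0, 1.0])

# Frames near the speaker-specific mean give a positive score.
print(average_llr([2.0, 2.2], speaker, ubm))
# Frames better explained by the background model give a negative score.
print(average_llr([0.5], speaker, ubm))
```

The claim is accepted when the averaged ratio exceeds a decision threshold, so any channel or codec effect that shifts the features also shifts this score.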
Development of a Speech Corpus for Speaker Verification Research in Multilingual Environment
2013
Automatic Speaker Verification (ASV) refers to the task of verifying the claimed identity of a speaker based on speech data. The decision made by a speaker verification system is a binary one that returns either "Yes" or "No" based on the credibility of the claim, as determined by some scoring technique. The output of an automatic speaker verification system is highly dependent on the database used for training and testing the system. The results obtained by the system are meaningless if the recording specifications and environment for the training and testing data are not known. This paper describes the methodology and experimental setup used for the development of a speech corpus for the evaluation of text-independent speaker verification systems in a multilingual environment. Four major languages of Arunachal Pradesh (a north-eastern frontier state of India, bordering China), namely Nyishi, Adi, Galo and Apatani, along with English and Hindi, have been considered for the dev...