Performance Evaluation of Different Modeling Methods and Classifiers with MFCC and IHC Features for Speaker Recognition
Related papers
Procedia Computer Science, 2018
Voice is an important human trait for identifying a person in natural human-to-human communication. It can therefore be regarded as a biometric measure for recognizing a person, similar to other biometric measures such as face, iris and fingerprints. Speaker recognition is a class of voice recognition in which the speaker, rather than the message, is identified from the speech. Automatic speaker recognition (SR) identifies people based on features extracted from speech utterances. The major task in any speaker recognition system is to extract useful features and to build speaker models that capture meaningful patterns. This paper compares the performance of two feature extraction techniques, Mel Frequency Cepstral Coefficient (MFCC) and Inner Hair Cell Coefficient (IHC), with two modelling methods, the Gaussian Mixture Model-Universal Background Model (GMM-UBM) and the i-vector approach. In this experiment, speech samples of 600 speakers from the TIMIT database, with 10 utterances per speaker, are used for speaker identification. A text-independent speaker recognition system was implemented, and the study found that the MFCC feature outperforms the IHC feature for both GMM and i-vector. In GMM, the IHC feature, which simulates the physiological behaviour of the human ear, is more accurate on voiced speech than on full speech (voiced and unvoiced).
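The GMM-UBM scoring idea mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the diagonal-covariance mixtures, the toy 2-D feature vectors, and the function names (`gmm_log_likelihood`, `llr_score`) are all assumptions made for the example. In practice the speaker model is commonly adapted from the UBM, a step omitted here.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log likelihood of x under a GMM given as (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(value - peak) for value in logs))

def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio: speaker model vs. background."""
    total = sum(
        gmm_log_likelihood(f, speaker_gmm) - gmm_log_likelihood(f, ubm)
        for f in frames
    )
    return total / len(frames)

# Hypothetical toy models; real features would be MFCC or IHC vectors.
ubm = [(0.5, [0.0, 0.0], [4.0, 4.0]), (0.5, [3.0, 3.0], [4.0, 4.0])]
speaker = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [0.5, 0.5], [1.0, 1.0])]

target_score = llr_score([[0.0, 0.0], [0.4, 0.3]], speaker, ubm)
impostor_score = llr_score([[3.0, 3.0], [3.2, 2.9]], speaker, ubm)
```

A positive score suggests the frames fit the speaker model better than the background model; a negative score suggests the opposite. The decision threshold would be tuned on held-out data.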
IJERT-A Study of Various Speech Features and Classifiers used in Speaker Identification
International Journal of Engineering Research and Technology (IJERT), 2016
https://www.ijert.org/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification https://www.ijert.org/research/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification-IJERTV5IS020637.pdf Speech processing consists of analysis/synthesis, recognition and coding of the speech signal. The recognition field further branches into speech recognition, speaker recognition and speaker identification. A speaker identification system is used to identify one speaker among many. A good identification rate is a prerequisite for any speaker identification system, and it can be achieved by making an optimal choice among the available techniques. This paper discusses different speech features and extraction techniques such as MFCC, LPCC, LPC, GLFCC and PLPC, and different feature classification models such as VQ, GMM, DTW, HMM and ANN, for speaker identification systems. Keywords: Linear Predictive Cepstral Coefficients (LPCC), Mel Frequency Cepstral Coefficients (MFCC), Gaussian Mixture Model (GMM), Vector Quantization (VQ), Hidden Markov Model (HMM), Artificial Neural Network (ANN)
IJERT-Speaker Recognition System Based On MFCC and VQ Algorithms
International Journal of Engineering Research and Technology (IJERT), 2014
https://www.ijert.org/speaker-recognition-system-based-on-mfcc-and-vq-algorithms https://www.ijert.org/research/speaker-recognition-system-based-on-mfcc-and-vq-algorithms-IJERTV3IS20286.pdf The main aim of this paper is speaker recognition, that is, automatically identifying who is speaking on the basis of individual information contained in speech waves. The objective is to compare a speech signal from an unknown speaker against a database of known speakers. Once trained on a number of speakers, the system can recognize any of them. Speaker recognition requires two tasks, "feature extraction" and "feature classification". Feature classification is further divided into two tasks, pattern matching and decision. For feature extraction, we use the mel frequency cepstral coefficient (MFCC) method. For feature classification, we use the vector quantization (VQ) method. In the feature matching stage, the "Euclidean distance" is applied.
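The matching stage described above (VQ codebooks compared by Euclidean distance) can be sketched roughly as follows. This is an illustration under assumptions, not the paper's code: the codebooks are given directly rather than trained, the feature vectors are toy 2-D values standing in for MFCC frames, and the names `average_distortion` and `identify` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    total = sum(min(euclidean(f, c) for c in codebook) for f in frames)
    return total / len(frames)

def identify(frames, codebooks):
    """Pick the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda spk: average_distortion(frames, codebooks[spk]))

# Hypothetical pre-trained codebooks, one per enrolled speaker.
codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 6.0]],
}
test_frames = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.1]]
best = identify(test_frames, codebooks)
```

Here pattern matching is the distortion computation and the decision is the final minimum over speakers. A real system would typically add a rejection threshold for unknown speakers.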
Comparative Study of Different Techniques in Speaker Recognition: Review
Speech is the most basic and essential method of communication used by people. The speaker is recognized on the basis of individual information included in the speech signal. Speaker recognition (SR) is useful for identifying the person who is speaking, and in recent years it has been used in security systems. In this paper we discuss techniques such as Mel frequency cepstral coefficient (MFCC), Linear predictive coding (LPC) and Dynamic time warping (DTW), and for classification Gaussian Mixture Models (GMM), Artificial neural network (ANN) and Support vector machine (SVM).
Person Identification through Voice using MFCC and Multi-class SVM
2018
Electronics and Communication Engineering, RCOEM, Nagpur; Asst. Professor, Dept. of Electronics and Communication Engineering, RCOEM, Nagpur. Abstract: Speaker recognition is the identification and verification of authorized personnel who are permitted to access a system. It is one of the biometric authentication processes in use today. Biometric verification plays a crucial role in the security of a system. Unlike a password, it cannot be copied from one person to another.
Automatic Speaker Recognition using LPCC and MFCC
A person's voice contains various parameters that convey information such as emotion, gender, attitude, health and identity. This report covers speaker recognition, which identifies a person based on the unique voiceprint present in their speech data. The speech signal is pre-processed before voice feature extraction, which ensures that the extracted features contain accurate information about the identity of the speaker. Voice feature extraction methods such as Linear Predictive Coding (LPC), Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are analysed and evaluated for their suitability for speaker recognition tasks. A new method that combines LPCC and MFCC (LPCC+MFCC) through output fusion is proposed and evaluated alongside the other voice feature extraction methods. The speaker model for all the methods is computed using the Vector Quantization-Linde, Buzo and Gray (VQ-LBG) method. In the LPCC+MFCC method, LPCC and MFCC are modelled and compared individually, and the two similarity scores are then combined for the identification decision. The results show that this method is better than, or at least comparable to, the traditional LPCC and MFCC methods.
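The abstract does not say how the two similarity scores are combined, so the following is only one plausible sketch of score-level fusion. The min-max normalization, the equal weighting, and the names `min_max` and `fuse` are assumptions for illustration, and the score values are made up.

```python
def min_max(scores):
    """Rescale a {speaker: score} map to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: ((v - lo) / span if span else 0.0) for k, v in scores.items()}

def fuse(lpcc_scores, mfcc_scores, weight=0.5):
    """Weighted sum of normalized similarity scores (assumed fusion rule)."""
    a, b = min_max(lpcc_scores), min_max(mfcc_scores)
    return {spk: weight * a[spk] + (1 - weight) * b[spk] for spk in a}

# Hypothetical similarity scores (higher means more similar).
lpcc = {"s1": 0.9, "s2": 0.4, "s3": 0.1}
mfcc = {"s1": 0.6, "s2": 0.8, "s3": 0.2}

fused = fuse(lpcc, mfcc)
decision = max(fused, key=fused.get)  # speaker with the highest fused score
```

Normalizing before summing keeps one feature stream from dominating simply because its scores span a wider numeric range. If the underlying scores were VQ distortions (lower is better), they would need to be negated or inverted first.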
Speaker Recognition Systems in the Last Decade – A Survey
Engineering and Technology Journal
Speaker recognition is defined as the process of recognizing a person by his/her voice through specific features extracted from the voice signal. Automatic Speaker Recognition (ASP) is a biometric authentication system. In the last decade, many advances have been made in the speaker recognition field, along with many techniques in the feature extraction and modeling phases. In this paper, we present an overview of the most recent works in ASP technology. The study discusses several ASP modeling techniques such as the Gaussian Mixture Model (GMM), Vector Quantization (VQ), and clustering algorithms. Several feature extraction techniques such as Linear Predictive Coding (LPC) and Mel frequency cepstral coefficients (MFCC) are also examined. Finally, as a result of this study, we found that MFCC and GMM could be considered the most successful techniques in the field of speaker recognition so far.
This paper talks about speaker recognition as an ordinary process whereas speaker identification and speaker verification refer to definite tasks or assessment modes associated with this process.
Comparative Analysis of Different Windowing Techniques in MFCC Speaker Recognition
2014
Speaker recognition is the process of automatically recognising the speaker on the basis of individual information included in speech waves. The objective of automatic speaker recognition is to extract, characterize and recognize the information about speaker identity. Speaker recognition technology can be used in many services such as voice dialling, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers. Feature extraction is an important process in speaker recognition. In this paper, the Mel Frequency Cepstrum Coefficients method is used to design a text-dependent speaker recognition system. Different types of windowing methods are used during feature extraction. A comparative analysis of different windowing techniques is carried out in order to determine the most effective windowing technique for MFCC speaker recognition.
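The windowing step applied to each frame before MFCC computation can be sketched as follows. The abstract does not name the windows it compares, so the three shown here (rectangular, Hamming, Hann) are an assumed selection of common choices, and the function names are illustrative.

```python
import math

def rectangular(n_samples):
    """Rectangular window: leaves the frame unchanged."""
    return [1.0] * n_samples

def hamming(n_samples):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
        for n in range(n_samples)
    ]

def hann(n_samples):
    """Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / (N - 1))."""
    return [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (n_samples - 1))
        for n in range(n_samples)
    ]

def apply_window(frame, window):
    """Multiply a frame of samples by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]

# Toy 5-sample frame; real frames are typically a few hundred samples long.
frame = [1.0, 1.0, 1.0, 1.0, 1.0]
tapered = apply_window(frame, hamming(len(frame)))
```

Tapered windows such as Hamming and Hann reduce the discontinuities at the frame edges, which lowers spectral leakage in the subsequent Fourier transform at the cost of a wider main lobe. The formulas above require a frame length of at least two samples.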