Feature Selection Method for Speaker Recognition using Neural Network
Related papers
Comparative Study of Different Techniques in Speaker Recognition: Review
Speech is the most basic and essential method of human communication. A speaker is recognized on the basis of the individual information contained in the speech signal. Speaker recognition (SR) identifies the person who is speaking, and in recent years it has been used in security systems. In this paper we discuss feature extraction techniques such as Mel frequency cepstral coefficients (MFCC), linear predictive coding (LPC), and dynamic time warping (DTW), and, for classification, Gaussian mixture models (GMM), artificial neural networks (ANN), and support vector machines (SVM).
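As an illustration of one technique named in this abstract, dynamic time warping aligns two sequences of different lengths by finding the lowest-cost monotonic path between them. The sketch below is a minimal version that assumes scalar sequences and an absolute-difference local cost. Real systems would compare frame-level feature vectors, and the function name `dtw_distance` is chosen here for illustration only.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # stretched copy of the same contour
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

A stretched copy of a sequence aligns at zero cost, which is why this kind of matching tolerates differences in speaking rate.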
On Feature Selection for Speaker Verification
2002
This paper describes an HMM-based speaker verification system that verifies each speaker in that speaker's own feature space. This 'individual' feature space is determined by a Dynamic Programming (DP) feature selection algorithm. A suitable criterion, correlated with the Equal Error Rate (EER), was developed and is used by the feature selection algorithm. The algorithm was evaluated on a text-dependent database. A significant improvement in verification results was demonstrated with the DP-selected individual feature space. An EER of 4.8% was achieved when the feature set was the "almost standard" Mel Frequency Cepstrum Coefficients (MFCC) space (12 MFCC + 12 ∆MFCC). Under the same conditions, a system based on the selected feature space yielded an EER of only 2.7%.
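The Equal Error Rate reported in this abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below shows one simple way to estimate it from two lists of verification scores. The scores are made-up illustrative values, not data from the paper, and the threshold sweep is an assumed procedure rather than the authors' method.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) at the threshold where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # true speakers rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores: higher means "more likely the claimed speaker".
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With these toy scores, one of five impostors is accepted and one of five genuine trials is rejected at a threshold of 0.6, giving an EER of 20%.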
Speaker recognition and verification using artificial neural network
2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2017
Speaker recognition is a biometric technique that uses individual voice samples for recognition. It is mainly divided into speaker identification and speaker verification. In this paper, a comparative study is made between various combinations of features for speaker identification. Mel frequency cepstral coefficient (MFCC) features are combined with spectral centroid and spectral subtraction and tested for improvement in efficiency. A feed-forward artificial neural network is used as the classifier. The system was tested on 30 speakers. For speaker identification, an average identification rate of 65.3% is achieved when MFCC is combined with centroid features, and an identification rate of 60% is achieved when MFCC is combined with spectral subtraction. For speaker verification, an average verification rate of 65.7% is achieved when MFCC is combined with spectral subtraction, and a verification rate of 75.3% is achieved when MFCC is used along with the centroid.
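The spectral centroid mentioned in this abstract is the magnitude-weighted mean frequency of a frame's spectrum. The sketch below computes it with a naive DFT so that it needs only the standard library. The frame length, sample rate, and test tone are assumptions for illustration, and the paper's exact feature pipeline is not specified here.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short example frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    n = len(frame)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


# Assumed example: a 1 kHz tone sampled at 8 kHz, 64 samples (exactly 8 cycles).
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
centroid = spectral_centroid(tone, 8000)
```

For a pure tone the centroid sits at the tone's frequency. In a combined feature vector, this single value would typically be appended to the per-frame MFCCs.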
The problem addressed in this paper is that the classical statistical approach to speaker recognition yields satisfactory results, but only at the expense of long training and test utterances. Reducing the length of speaker samples is of great importance in the field, since this limitation usually precludes the statistical approach from use in real-time applications. A novel method of text-independent speaker recognition is proposed that uses only the correlations among MFCCs, computed over selected speech segments of very short length (approximately 120 ms). Three different neural networks, the Multi-Layer Perceptron (MLP), Steinbuch's Learnmatrix (SLM), and the Self-Organizing Feature Finder (SOFF), are evaluated in a speaker recognition task. The dimensionality reduction ability of the SOFF paradigm is also discussed.
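One way to picture "correlations among MFCCs" over a short segment is to correlate the trajectories of each pair of coefficients across the segment's frames. The sketch below assumes Pearson correlation and keeps the upper triangle of the correlation matrix as the feature vector. Both choices are assumptions made for illustration, and the input values are toy numbers rather than real MFCCs.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_features(frames):
    """frames: list of per-frame coefficient vectors from one short segment.

    Returns the pairwise correlations between coefficient trajectories
    (upper triangle of the correlation matrix, row by row).
    """
    trajectories = list(zip(*frames))  # one sequence per coefficient
    k = len(trajectories)
    return [
        pearson(trajectories[i], trajectories[j])
        for i in range(k)
        for j in range(i + 1, k)
    ]


# Toy segment: 3 frames, 3 coefficients per frame.
segment = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 6.0, 2.0]]
features = correlation_features(segment)
```

For K coefficients this yields K(K-1)/2 values per segment regardless of how many frames the segment contains, which gives a fixed-size input for a neural network classifier.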
IJERT-A Study of Various Speech Features and Classifiers used in Speaker Identification
International Journal of Engineering Research and Technology (IJERT), 2016
https://www.ijert.org/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification https://www.ijert.org/research/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification-IJERTV5IS020637.pdf Speech processing consists of the analysis/synthesis, recognition, and coding of speech signals. The recognition field further branches into speech recognition, speaker recognition, and speaker identification. A speaker identification system is used to identify one speaker among many. A good identification rate is a prerequisite for any speaker identification system and can be achieved by making an optimal choice among the available techniques. In this paper, different speech features and extraction techniques such as MFCC, LPCC, LPC, GLFCC, and PLPC, and different feature classification models such as VQ, GMM, DTW, HMM, and ANN, are discussed for speaker identification systems. Keywords: Linear Predictive Cepstral Coefficients (LPCC), Mel Frequency Cepstral Coefficients (MFCC), Gaussian Mixture Model (GMM), Vector Quantization (VQ), Hidden Markov Model (HMM), Artificial Neural Network (ANN)
Neural network based speaker classification and verification systems with enhanced features
2017 Intelligent Systems Conference (IntelliSys), 2017
This work presents a novel framework based on a feed-forward neural network for text-independent speaker classification and verification, two related tasks in speaker recognition. With optimized features and model training, it achieves a 100% classification rate and less than 6% Equal Error Rate (EER), using only about 1 second and 5 seconds of data respectively. Two feature-level measures are shown to improve system performance: Voice Activity Detection (VAD) that is stricter than the one typically used for speech recognition, which extracts the more strongly voiced portions of speech, and speaker-level mean and variance normalization, which helps eliminate the discrepancy between samples from the same speaker. In building the neural network speaker classifier, the network structure parameters are optimized with grid search, and dynamically reduced regularization parameters are used to keep training from terminating in a local minimum, which allows training to proceed further and reach a lower cost. In speaker verification, performance is improved with prediction score normalization, which rewards speaker identity indices with distinct peaks and penalizes weak ones that have high scores but more competitors, and with speaker-specific thresholding, which significantly reduces the EER on the ROC curve. The TIMIT corpus at an 8 kHz sampling rate is used. The first 200 male speakers are used to train and test classification performance. Their test files serve as in-domain registered speakers, while data from the remaining 126 male speakers serve as out-of-domain speakers, i.e. impostors in speaker verification.
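Speaker-level mean and variance normalization, as named in this abstract, can be sketched as rescaling each feature dimension to zero mean and unit variance using statistics pooled over all frames from one speaker. The data layout (a dictionary from a speaker ID to a list of frames) and the fallback for constant dimensions are assumptions made for this example, not details taken from the paper.

```python
import math


def speaker_mvn(frames_by_speaker):
    """Normalize each speaker's frames to zero mean, unit variance per dimension."""
    normalized = {}
    for speaker, frames in frames_by_speaker.items():
        dims = list(zip(*frames))  # one tuple of values per feature dimension
        means = [sum(d) / len(d) for d in dims]
        # Population standard deviation; fall back to 1.0 for constant dimensions.
        stds = [
            math.sqrt(sum((v - m) ** 2 for v in d) / len(d)) or 1.0
            for d, m in zip(dims, means)
        ]
        normalized[speaker] = [
            [(v - m) / s for v, m, s in zip(frame, means, stds)] for frame in frames
        ]
    return normalized


# Hypothetical speaker with two 2-dimensional frames.
features = {"spk_a": [[1.0, 10.0], [3.0, 10.0]]}
result = speaker_mvn(features)
```

Computing the statistics per speaker, rather than over the whole corpus, removes offsets that differ between that speaker's recordings while leaving the relative shape of the features intact.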
Performance Evaluation of Feature Extraction and Modeling Methods for Speaker Recognition
Annals of Reviews & Research, 2018
In this study, the performance of prominent feature extraction and modeling methods in speaker recognition systems is evaluated on a specially created database. The main feature of the database is that its subjects are siblings or relatives. After basic information about speaker recognition systems is given, the outstanding properties of the methods are briefly described. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are used for feature extraction, and Gaussian Mixture Model (GMM) and i-vector methods are employed for modeling. The best results are sought by varying the parameters of these methods: the number of features for LPCC and MFCC, and the number of mixture components for GMM. The aim of this study is to find out which parameters of the most commonly used methods contribute to success and, at the same time, to determine the best combination of feature extraction and modeling methods for speakers with similar voices. This study is also a useful resource and guide for researchers in the area of speaker recognition.
Speaker recognition using artificial neural networks
2002
This paper deals with the application of Radial Basis Function Neural Networks (RBFNNs) and Elliptical Basis Function Neural Networks (EBFNNs) to text-independent speaker recognition experiments. These include both closed-set and open-set speaker identification and speaker verification. The database used is a subset of the TIMIT database consisting of 60 speakers from different dialect regions. LP-derived Cepstral Coefficients (LPCC) are used as the speaker-specific features. Simulation results show that EBFNNs outperform RBFNNs in several speaker recognition experiments.
Speaker Recognition with various Feature extraction and classification Techniques: A review
2018
Speech is a very natural form of human communication, and speech processing is one of the significant application areas of digital signal processing. Research developments in speech processing include speech recognition, speaker recognition, speech synthesis, speaker identification, speech extraction, and speech coding. In speaker recognition, the person who is speaking is recognized automatically on the basis of the individual information carried in the speech wave. The speaker's speech is used to verify and recognize their identity using feature extraction techniques. The objective of this review paper is to summarize various feature extraction and classification techniques. Keywords: Analysis, segmentation, feature extraction, speaker identification, matching, LPC, MFCC, RASTA filtering