K-mean Clustering and Arabic Vowels Formants Based Speaker Identification System

Text Independent Speaker Identification based on K-mean Algorithm

This paper proposes a text-independent speaker identification system based on Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and on the Vector Quantization technique to minimize the data required for processing. The correlation between the identification success rate and the various parameters of the system, including the feature extraction tools and the data minimization technique, is examined. The extracted features of a speaker are quantized into a number of centroids, and the K-mean algorithm has been integrated into the proposed speaker identification system. These centroids constitute the codebook of that speaker. MFCCs are calculated in both the training and testing phases. To calculate these MFCCs, speakers uttered different words, once in a training session and once in a testing one. The speakers were identified according to the minimum quantization distance, calculated between the centroids of each speaker from the training phase and the MFCCs of individual speakers in the testing phase. Analysis was carried out to identify parameter values that could improve the performance of the system. The experimental results illustrate the efficiency of the proposed method under several conditions.
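The codebook pipeline described above (K-means centroids per speaker, identification by minimum quantization distance) can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: synthetic Gaussian vectors stand in for real MFCCs, and the speaker names, codebook size and feature dimensions are all assumptions.

```python
import numpy as np

def kmeans(features, k, iters=50, seed=0):
    """Plain K-means: returns k centroids for the given feature vectors."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign each vector to its nearest centroid, then recompute means
        d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = features[labels == j].mean(axis=0)
    return centroids

def quantization_distance(features, codebook):
    """Mean distance from each test vector to its nearest codebook centroid."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def identify(test_features, codebooks):
    """Pick the speaker whose codebook yields the smallest distortion."""
    return min(codebooks, key=lambda s: quantization_distance(test_features, codebooks[s]))

# toy demo: synthetic "MFCC" vectors clustered around speaker-specific means
rng = np.random.default_rng(1)
train = {"alice": rng.normal(0.0, 1.0, (200, 12)),
         "bob":   rng.normal(4.0, 1.0, (200, 12))}
codebooks = {spk: kmeans(feats, k=8) for spk, feats in train.items()}
test = rng.normal(4.0, 1.0, (50, 12))   # drawn from "bob"'s distribution
print(identify(test, codebooks))        # → bob
```

In a real system the codebook size (k) would be one of the parameters whose influence on the identification rate is examined.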

Speaker Identification Using Discrete Wavelet Transform

Journal of Computer Science, 2014

This study presents an experimental evaluation of Discrete Wavelet Transforms for use in speaker identification. The features are tested using speech data provided by the CHAINS corpus. The system consists of two stages: a feature extraction stage and an identification stage. Parameters are extracted and used in a closed-set text-independent speaker identification task. In this study the signals are pre-processed and features are extracted using discrete wavelet transforms. The energy of the wavelet coefficients is used for training the Gaussian Mixture Model. Daubechies wavelets are used and the speech samples are analyzed using 8 levels of decomposition.
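As a rough illustration of energy-per-level wavelet features, the sketch below uses the Haar wavelet (db1, the simplest member of the Daubechies family) implemented directly in NumPy, rather than the higher-order Daubechies filters used in the paper; the 8-level decomposition matches the abstract, but the toy sine frame and frame length are assumptions.

```python
import numpy as np

def haar_dwt_step(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = x[: len(x) // 2 * 2]              # truncate to even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)  # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)  # high-pass (detail)
    return a, d

def wavelet_energy_features(signal, levels=8):
    """Energy of the detail coefficients at each level, plus the final approximation energy."""
    feats = []
    a = np.asarray(signal, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt_step(a)
        feats.append(np.sum(d ** 2))
    feats.append(np.sum(a ** 2))
    return np.array(feats)

# toy demo: a 1 kHz sine "frame" sampled at 16 kHz
t = np.arange(1024) / 16000.0
frame = np.sin(2 * np.pi * 1000 * t)
e = wavelet_energy_features(frame, levels=8)
print(e.shape)  # (9,) — 8 detail energies + 1 approximation energy
```

Because the Haar transform is orthonormal, the nine energies sum to the total energy of the frame; such compact energy vectors are what the GMM is trained on.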

Comparative Study and Evaluation of Speaker Verification Systems on Various Techniques: A Literature Review

2016

Nowadays, speech recognition has become more and more important, and various speech applications are available in the market. Consumer electronic devices can be operated by voice. In the proposed system, a mobile phone is used as the controller for the whole system. In mobile phones, CELP (code-excited linear prediction) is used for speech coding. In the CELP-based speaker verification method, the CELP encoding used for mobile phone voice communication is applied to the encoded voice data to perform speaker verification. A fuzzy c-means algorithm is used: in fuzzy clustering, a data point can belong to more than one cluster, and associated with each point are membership grades that indicate the degree to which it belongs to the various clusters. The FCM algorithm gives better results for overlapped data sets and is comparatively better than the k-means algorithm.
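The fuzzy c-means update described above can be sketched in a few lines of NumPy: each point gets a membership grade in every cluster, the grades sum to one per point, and the fuzzifier m controls how soft the assignment is. The toy 2-D data and all parameter values below are illustrative.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    """Fuzzy c-means: returns membership matrix U (points x clusters) and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1 per point
    for _ in range(iters):
        W = U ** m                                 # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # membership update: inverse-distance weighting with fuzzifier m
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

# toy demo: two overlapping 2-D Gaussian clusters
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
U, centers = fuzzy_c_means(X, c=2)
print(U.sum(axis=1)[:3])  # each row of U sums to 1
```

With m → 1 the memberships approach the hard 0/1 assignments of k-means; larger m spreads membership across clusters, which is what helps on overlapped data.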

Comparative Study of Different Techniques in Speaker Recognition: Review

Speech is the most basic and essential method of communication used by people. A speaker is recognized on the basis of the individual information included in the speech signal. Speaker recognition (SR) is used to identify the person who is speaking, and in recent years it has been applied to security systems. In this paper we discuss feature extraction techniques such as Mel frequency cepstral coefficients (MFCC), linear predictive coding (LPC) and dynamic time warping (DTW), and, for classification, Gaussian mixture models (GMM), artificial neural networks (ANN) and support vector machines (SVM).
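Dynamic time warping, listed among the techniques above, aligns two sequences of different lengths by minimizing the cumulative frame-to-frame cost. A minimal sketch on 1-D toy sequences follows (a real system would warp sequences of MFCC frames instead; the example sequences are illustrative):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of match / insertion / deletion
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

# a time-stretched copy of a template costs far less than a different pattern
template = [0, 1, 2, 3, 2, 1, 0]
stretched = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
different = [3, 3, 3, 0, 0, 0, 3]
print(dtw_distance(template, stretched), dtw_distance(template, different))
```

This tolerance to differing speaking rates is why DTW is attractive for comparing utterances.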

Speaker Identification System for Hindi And Marathi Languages using Wavelet and Support Vector Machine

In this paper, a speaker identification system using speech processing for the Hindi and Marathi languages is developed. A database is created of words common to Hindi and Marathi, whose script is shared but whose pronunciation differs. Feature extraction is performed using Wavelet Packet Decomposition (WPD) and classification is performed using a Support Vector Machine (SVM). Compared with conventional feature extraction techniques, the wavelet transform is well suited to processing speech signals, which are non-stationary in nature, because of its efficient time-frequency localization and multi-resolution characteristics. SVM is likewise well suited to the speaker identification task. A recognition accuracy of 99.77% is obtained, whereas a real-time recognition accuracy of 84.66% is obtained under identical conditions using this hybrid architecture of WPD and SVM. Under noisy conditions a recognition accuracy of 60% is obtained.

Speaker Recognition Using Discrete Wavelet Transform and Artificial Neural Networks

2016

In recent years biometrics has emerged as an applied scientific discipline whose objective is to automatically capture personal identifying characteristics that distinguish one individual from another, and to use those measurements for security, surveillance and forensic applications. Speaker recognition is the process of automatically recognizing who is speaking based on individual information included in the speech waves. This paper presents a speaker identification method based on the Discrete Wavelet Transform (DWT) and Artificial Neural Networks (ANN). In this study the DWT is used to extract a speaker's discriminative features from the mathematical representation of the speech signal. These feature vectors are used to train a feedforward neural network, which models the speakers and performs the decision task. A database of 20 speakers (10 male and 10 female) has been used with a vocabulary of Kurdish words. The system led to a 100% identification rate for text-dependent and 8...
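A minimal sketch of the feedforward-network classifier idea: a one-hidden-layer net trained with full-batch gradient descent on toy two-class features. The architecture, data and hyperparameters below are assumptions for illustration, not those of the paper.

```python
import numpy as np

def train_mlp(X, y, hidden=8, classes=2, lr=0.3, epochs=500, seed=0):
    """One-hidden-layer feedforward net trained with plain gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, classes));    b2 = np.zeros(classes)
    Y = np.eye(classes)[y]                       # one-hot targets
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden activations
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax class posteriors
        G = (P - Y) / len(X)                     # cross-entropy gradient wrt Z
        GH = (G @ W2.T) * (1 - H ** 2)           # backprop through tanh
        W2 -= lr * (H.T @ G); b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH); b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def predict(X, params):
    """Class with the highest network output for each row of X."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

# toy 2-class data standing in for per-speaker feature vectors
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(3, 1, (100, 4))])
y = np.array([0] * 100 + [1] * 100)
params = train_mlp(X, y)
print((predict(X, params) == y).mean())  # training accuracy (near 1.0 on these separable classes)
```

In the paper's setting, the inputs would be DWT-derived feature vectors and the output layer would have one unit per enrolled speaker.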

Comparison of clustering algorithms in speaker identification

2000

In speaker identification, we match a given (unknown) speaker to the set of known speakers in a database. The database is constructed from the speech samples of each known speaker. Feature vectors are extracted from the samples by short-term spectral analysis and processed further by vector quantization for locating the clusters in the feature space. We study the role of vector quantization in the speaker identification system. We compare the performance of different clustering algorithms and the influence of the codebook size. We want to find out which method provides the best clustering result, and whether differences in clustering quality contribute to improvements in the recognition accuracy of the system.
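The questions the paper studies, how clustering quality and codebook size affect the codebook, can be illustrated by comparing K-means codebooks against codebooks of randomly sampled vectors at several sizes. The toy Gaussian data and the sizes chosen are illustrative assumptions.

```python
import numpy as np

def build_codebook(X, k, method="kmeans", iters=30, seed=0):
    """Codebook of k centroids, via K-means or by random sampling of X."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    if method == "random":
        return C
    for _ in range(iters):  # standard Lloyd iterations refine the random init
        labels = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    return C

def distortion(X, C):
    """Average distance from each vector to its nearest centroid."""
    return np.linalg.norm(X[:, None] - C[None], axis=2).min(axis=1).mean()

rng = np.random.default_rng(4)
X = rng.normal(0, 1, (500, 6))
for k in (4, 16, 64):
    dk = distortion(X, build_codebook(X, k, "kmeans"))
    dr = distortion(X, build_codebook(X, k, "random"))
    print(k, round(dk, 3), round(dr, 3))  # distortion drops as k grows
```

Lower quantization distortion does not automatically translate into higher recognition accuracy, which is exactly the relationship the paper investigates.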

Discrete wavelet transform for automatic speaker recognition

2010

This paper deals with automatic speaker recognition. We consider a context-independent speaker recognition task with a closed set of speakers. In [1] we presented a comparative study of the most frequently used parametrization/classification methods for the Czech language. The Wavelet Transform (WT) is a modern parametrization method successfully used for several signal processing tasks. WT often outperforms parametrizations based on the Fourier Transform, due to its capability to represent the signal precisely in both the frequency and time domains. The main goal of this paper is thus to use and evaluate several wavelet transforms, in place of the conventional parametrizations used previously, as the parametrization method for automatic speaker recognition. All experiments are performed on two Czech speaker corpora that contain speech of ten and fifty Czech native speakers, respectively. Three discrete wavelet families with different numbers of coefficients have been used and evaluated, Daubechies, Symlets and Coiflets, with two classifiers: Gaussian Mixture Model (GMM) and Multi-Layer Perceptron (MLP). We show that the recognition accuracy of the wavelet parametrizations is very good and sometimes outperforms the best parametrizations presented in our previous work.

IJERT-A Study of Various Speech Features and Classifiers used in Speaker Identification

International Journal of Engineering Research and Technology (IJERT), 2016

https://www.ijert.org/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification
https://www.ijert.org/research/a-study-of-various-speech-features-and-classifiers-used-in-speaker-identification-IJERTV5IS020637.pdf

Speech processing consists of the analysis/synthesis, recognition and coding of speech signals. The recognition field further branches into speech recognition, speaker recognition and speaker identification. A speaker identification system is used to identify a speaker among many speakers. A good identification rate is a prerequisite for any speaker identification system, and it can be achieved by making an optimal choice among the available techniques. In this paper, different speech features and extraction techniques such as MFCC, LPCC, LPC, GLFCC and PLPC, and different feature classification models such as VQ, GMM, DTW, HMM and ANN, are discussed for speaker identification systems.

Keywords: Linear Predictive Cepstral Coefficients (LPCC), Mel Frequency Cepstral Coefficients (MFCC), Gaussian Mixture Model (GMM), Vector Quantization (VQ), Hidden Markov Model (HMM), Artificial Neural Network (ANN)

Speaker Recognition – Wavelet Packet Based Multiresolution Feature Extraction Approach

2017

This paper proposes a novel Wavelet Packet based feature extraction approach for the task of text-independent speaker recognition. The features are extracted using a combination of Mel Frequency Cepstral Coefficients (MFCC) and the Wavelet Packet Transform (WPT). The hybrid features technique exploits the human-ear simulation offered by MFCC, combining it with the multi-resolution property and noise robustness of WPT. To check the validity of the proposed approach for text-independent speaker identification and verification, we use the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), respectively, as classifiers. The proposed paradigm is tested on the VoxForge speech corpus and the CSTR US KED Timit database. The paradigm is also evaluated after adding standard noise signals at different SNR levels to assess noise robustness. Experimental results show that better results are achieved for both speaker identification and speaker verification.