Feature Extraction Methods LPC, PLP and MFCC

REVIEW OF SPEECH AND SPEECH RECOGNITION SYSTEM USING FEATURE EXTRACTION ALGORITHM AND OPTIMIZATION ALGORITHMS

This paper analyses the audio inconsistency of speakers and its impact on the robustness of existing automatic speech recognition and speaker recognition systems. The acoustic and visual features are evaluated by a Support Vector Machine for digit and speaker detection and later by Hidden Markov Model verification. A methodology for speech recognition with speaker recognition based on the Hidden Markov Model is needed for security applications. Speech can also be mapped using Artificial Neural Networks. Fireflies produce glowing flashes as a signalling scheme to communicate with other fireflies, particularly to attract prey. The Cuckoo search algorithm is used because its search space is extensive. A genetic algorithm avoids calculating the system slope required in traditional gap investigation and determines the optimum interval range of the parameters within an acceptable error boundary for the corresponding objective. To obtain the most efficient and linearly discriminative components, LDA is used.

A Comparative Study of Feature Extraction Techniques for Speech Recognition System

The automatic recognition of speech enables a natural and easy mode of communication between human and machine. Speech processing has many applications, including voice dialing, telephone communication, call routing, domestic appliance control, speech-to-text conversion, text-to-speech conversion, lip synchronization and automation systems. Here we discuss some of the most widely used feature extraction techniques, such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC) analysis, Dynamic Time Warping (DTW), Relative Spectra Processing (RASTA) and Zero Crossings with Peak Amplitudes (ZCPA). Some techniques, such as RASTA and MFCC, take the nature of speech into account when extracting features, while LPC predicts future features based on previous features.
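As a rough illustration of the prediction idea behind LPC: in the standard formulation, each speech sample is approximated as a weighted sum of the preceding samples, and the weights are the features. The sketch below estimates those weights with the autocorrelation method and the Levinson-Durbin recursion. It is a minimal pure-Python sketch, not the implementation used in the paper; the function names `autocorrelation` and `lpc_coefficients` are hypothetical, and framing, windowing and pre-emphasis are omitted.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(frame, order):
    """Levinson-Durbin recursion.

    Returns (a, error), where a[k-1] is the weight of sample s[t-k] in the
    prediction s[t] ~ sum_k a[k-1] * s[t-k], and error is the residual energy.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error == 0.0:
            break  # silent or perfectly predictable frame
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / error
        # Update the predictor from order i-1 to order i.
        updated = a[:]
        updated[i] = k_i
        for k in range(1, i):
            updated[k] = a[k] - k_i * a[i - k]
        a = updated
        error *= 1.0 - k_i * k_i
    return a[1:], error


# Hypothetical test signal: a decaying exponential s[t] = 0.9 ** t,
# for which the ideal first-order predictor weight is 0.9.
frame = [0.9 ** t for t in range(200)]
coeffs, residual = lpc_coefficients(frame, order=2)
print(coeffs)
```

For the synthetic frame above the first coefficient comes out very close to 0.9 and the second close to zero, since a single previous sample already predicts the signal.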

Empirical Review Paper on Voice Recognition & Feature Extraction Techniques

Speech recognition is the process of automatically recognizing the spoken words of a person based on the information content of the speech signal. Many reviews and surveys have been conducted on voice feature extraction techniques, but most of them have not provided an exhaustive empirical review of the techniques. This paper provides an empirical review, with relevant algorithmic calculations, of each feature extraction technique for voice recognition, and discusses the techniques and systems that make it possible for computers to accept voice as input. It shows the major developments in the field of voice analytics and gives detailed information on the three main feature extraction techniques: Linear Predictive Coding (LPC), Mel-frequency cepstral coefficients (MFCCs) and the RASTA filtering technique. The objective of this paper is to summarize the feature extraction techniques used in speech recognition systems and to provide an empirical value for each technique. The words "voice" and "speech" are used interchangeably in this context.

Speech Recognition using Dynamic Time Warping, Hidden Markov Model and Artificial Neural Networks

In this paper, an advanced method is presented that is able to classify speech signals with high accuracy in minimum time. First, the recorded signal is preprocessed; this stage includes denoising with Mel Frequency Cepstral Analysis and feature extraction using discrete wavelet transform coefficients. These features are then fed to a Multilayer Perceptron network for classification. Finally, after training of the neural network, effective features are selected with the UTA algorithm.

Speech Feature Extraction and Matching Technique

2016

The ultimate goal of the present investigation is to study speech coding techniques for better understanding of natural spoken language, considering the obvious constraints such as speaker dependency, isolated words, limited vocabulary and artificial grammar. Speech communication technology between human and computer is experiencing revolutionary progress in the information industry. For analysis, synthesis, coding and recognition purposes, speech signals have to be converted into digital form. Speech signals are continuous-time, continuous-amplitude waveforms, which are sampled and quantized. In the present work, two techniques have been used: linear predictive coding for feature extraction and Dynamic Time Warping for matching. Dynamic Time Warping is a cost-minimization matching technique in which a test signal is stretched or compressed according to a reference template.
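The cost-minimization idea behind Dynamic Time Warping can be sketched with the classic dynamic-programming recurrence. The example below is a minimal illustration on one-dimensional sequences, assuming an absolute-difference local cost and the three standard step directions; the name `dtw_distance` is hypothetical, and a real recognizer would compare frames of feature vectors (for example LPC coefficients) rather than scalars.

```python
def dtw_distance(reference, test):
    """Minimum cumulative cost of aligning `test` to `reference`.

    cost[i][j] holds the cheapest alignment of the first i reference
    samples with the first j test samples.
    """
    n, m = len(reference), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # test sample repeated (test stretched)
                cost[i][j - 1],      # reference sample repeated (test compressed)
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


# A test sequence that lingers on one value still matches the template exactly.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

In a template-matching recognizer, the test utterance would be compared against each stored reference and assigned to the word whose template gives the lowest cost.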

Speech Recognition System with Different Methods of Feature Extraction

International Journal of Innovative Research in Computer and Communication Engineering

The paper presents the design of a speech recognition system that uses preprocessing, feature extraction and classification stages. In the preprocessing stage, de-noising is performed to obtain speech data without noise. In the feature extraction stage, Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC) and spectrogram methods are used to extract the features of each word. Neural Networks (NN) are used to classify the spoken words into different patterns, so that the system can recognize unknown spoken words according to these patterns. A set of spoken words is used in the simulation of the system. Comparative results of the system are provided for the above-mentioned feature extraction methods.

A Review on Speech Feature Techniques and Classification Techniques

International Journal of Trend in Scientific Research and Development, 2018

Speech processing is one of the important methods used in the application areas of digital and analog signal processing. It is used in real-world processing of human language, such as human-computer interface systems for the home, industry and the medical field. Speech is the most common means of communication because the information it carries plays a fundamental role in conversation. Speech recognition converts an acoustic signal, captured by a microphone or a telephone, into a set of words. Research fields in speech processing include speech recognition, speaker recognition, speech synthesis and speech coding. Speech recognition is the process of automatically recognizing the spoken words of a person based on the information content of the speech signal. This paper presents a brief study of Automatic Speech Recognition and discusses the various classification techniques that have been developed in this wide area of speech processing. The objective of this paper is to study some of the well-known methods that are widely used in the several stages of a speech recognition system.

Stand-Alone Intelligent Voice Recognition System

Journal of Signal and Information Processing, 2014

In this paper, an expert system for security is presented, based on biometric human features that can be obtained without any contact with the registering sensor. These features are extracted from the human voice, so the system is called a Voice Recognition System (VRS). The proposed system consists of a combination of three stages: signal pre-processing, feature extraction using the Wavelet Packet Transform (WPT) and feature matching using Artificial Neural Networks (ANNs). The feature vectors are formed in two steps: first, the speech signal is decomposed at level 7 with Daubechies 20-tap (db20) wavelets; second, the energy corresponding to each WPT node is calculated, and these energies are collected to form a feature vector. A feature vector of 128 elements for each speaker was fed to the Feed Forward Back-propagation Neural Network (FFBPNN). The data used in this paper are drawn from the English Language Speech Database for Speaker Recognition (ELSDSR), which is composed of audio files for training and other files for testing. The performance of the proposed system is evaluated using the test files. Our results showed that the rate of correct recognition of the proposed system is about 100% for training files and 95.7% for one testing file for each speaker from the ELSDSR database. The proposed method showed better efficiency than the well-known Mel Frequency Cepstral Coefficient (MFCC) and the Zak transform.
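The node-energy feature described here can be sketched as follows. A full wavelet packet decomposition to level 7 produces 2^7 = 128 terminal nodes, which matches the 128 features mentioned above. Note the loud assumption: this sketch substitutes the two-tap Haar wavelet for the paper's db20 filter so that it stays dependency-free, so the numbers it produces are not comparable to the paper's. The names `haar_split` and `wavelet_packet_energies` are hypothetical, and the signal length is assumed to be a multiple of 2^level.

```python
import math


def haar_split(x):
    """One Haar analysis step: return (approximation, detail) at half length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal, level):
    """Energy of every terminal node of a full wavelet packet tree.

    Unlike the plain wavelet transform, both the approximation and the
    detail branch are split again at every level, giving 2**level nodes.
    Nodes are returned in natural (tree) order, not frequency order.
    """
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]


# Hypothetical input: a synthetic tone standing in for a speech segment.
signal = [math.sin(0.05 * t) for t in range(1024)]
features = wavelet_packet_energies(signal, level=7)
print(len(features))  # 128
```

Because the Haar step is orthonormal, the 128 energies sum to the energy of the input segment, which is a convenient sanity check before feeding the vector to a classifier.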

DIFFERENT FEATURE EXTRACTION TECHNIQUES FOR AUTOMATIC SPEECH RECOGNITION: A REVIEW

Automatic speech recognition, which allows a natural and user-friendly means of communication between a person and a device, is an active research area. Speech recognition is the ability to listen to what is being said, to interpret it and to perform actions based on the spoken information. This article presents a short outline of speech recognition and of the techniques intended for feature extraction in speech recognition systems, namely MFCC, LPC and PLP. Among these three techniques, Mel frequency cepstral coefficients (MFCC) are the most frequently used feature extraction technique in speech recognition because they are the closest to actual human auditory perception.
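The perceptual basis of MFCC is the mel scale, which spaces frequencies roughly the way listeners perceive pitch. The sketch below uses one widely quoted conversion, mel = 2595 * log10(1 + f / 700); this formula and the helper names `hz_to_mel`, `mel_to_hz` and `mel_filter_centers` are assumptions for illustration and are not taken from the article. It shows only how filter centre frequencies are placed, not the full MFCC pipeline (framing, FFT, filterbank energies, log, DCT).

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (common 2595 / 700 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(low_hz, high_hz, num_filters):
    """Centre frequencies (Hz) of filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]


centers = mel_filter_centers(0.0, 8000.0, 10)
print([round(c) for c in centers])
```

The centres are densely packed at low frequencies and spread out at high frequencies, which mirrors the finer frequency resolution of human hearing in the low range.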

Analysis of speech recognition techniques

INTERNATIONAL JOURNAL OF ADVANCE RESEARCH, IDEAS AND INNOVATIONS IN TECHNOLOGY

This paper focuses on speech recognition techniques such as LPC (linear predictive coding), MFCC (Mel-frequency cepstral coefficients) with Hidden Markov Models, LPCC (linear predictive cepstral coding) and RASTA, and compares these techniques to find the most accurate and efficient way to recognize speech. Speech recognition is the process in which a program or machine identifies words or phrases and converts them to a machine-readable format. Additionally, this paper focuses on NLP (natural language processing) techniques used with the speech recognition process. Once the speech signal is converted to text, NLP is used to understand and generate what has been said. NLU (natural language understanding) and NLG (natural language generation) are two important steps in NLP. In this paper, we compare and analyse techniques to find out which can be used with speech recognition for effective results. The objective of this paper is to identify the best technique currently in use.