Kolmogorov-Smirnov test for feature selection in emotion recognition from speech

Segment-based approach to the recognition of emotions in speech

2005

A new framework for context- and speaker-independent recognition of emotions from voice, based on a richer and more natural representation of the speech signal, is proposed. The utterance is viewed as a series of voiced segments rather than as a single object. The voiced segments are first identified and then described using statistical measures of spectral shape, intensity, and pitch contours, calculated at both the segment and the utterance level. Utterance classification is performed by combining the segment classification decisions using a fixed combination scheme. The performance of two learning algorithms, Support Vector Machines and k-Nearest Neighbors, is compared. The proposed approach yields an overall classification accuracy of 87% for five emotions, outperforming previous results on a similar database.
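As a rough illustration of the segment-then-combine idea, the sketch below classifies per-segment feature vectors and fuses the decisions by majority vote, one possible fixed combination scheme. All data shapes, feature contents, and classifier settings are placeholders for illustration, not details taken from the paper.

```python
# A minimal sketch of segment-based utterance classification, assuming
# per-segment feature vectors have already been extracted.
import numpy as np
from sklearn.svm import SVC

def classify_utterance(segment_features, clf):
    """Classify each voiced segment, then combine the decisions by
    majority vote (a fixed combination scheme) to label the utterance."""
    segment_labels = clf.predict(segment_features)
    values, counts = np.unique(segment_labels, return_counts=True)
    return values[np.argmax(counts)]

# Hypothetical training data: one row per voiced segment.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))      # e.g. pitch/intensity/spectral stats
y_train = rng.integers(0, 5, size=200)    # five emotion classes
clf = SVC().fit(X_train, y_train)

utterance = rng.normal(size=(7, 12))      # seven voiced segments in one utterance
print(classify_utterance(utterance, clf))
```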

Enhancing Emotion Recognition from Speech through Feature Selection

2010

In the present work we aim at performance optimization of a speaker-independent emotion recognition system through a speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion Challenge, we studied the relative importance of the individual speech parameters and, based on their ranking, selected a subset of speech parameters that offered advantageous performance. The affect-emotion recognizer utilized here relies on a GMM-UBM-based classifier. In all experiments we followed the experimental setup defined by the Interspeech 2009 Emotion Challenge, utilizing the FAU Aibo Emotion Corpus of spontaneous, emotionally coloured speech. The experimental results indicate that the correct choice of speech parameters can lead to better performance than the baseline.
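The rank-then-select idea can be sketched as follows. A plain SVM with univariate F-score ranking stands in for the paper's GMM-UBM classifier and its actual ranking procedure, and the data, candidate subset sizes, and feature count are synthetic assumptions.

```python
# Hedged sketch: rank features, then keep the best-performing subset size
# as judged by cross-validation.
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 40))        # stand-in for the Challenge feature set
y = rng.integers(0, 5, size=300)

scores, _ = f_classif(X, y)           # rank features by univariate relevance
order = np.argsort(scores)[::-1]

best_k, best_acc = None, -np.inf
for k in (5, 10, 20, 40):             # candidate subset sizes
    acc = cross_val_score(SVC(), X[:, order[:k]], y, cv=5).mean()
    if acc > best_acc:
        best_k, best_acc = k, acc
print(best_k, round(best_acc, 3))
```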

Acoustic feature selection for automatic emotion recognition from speech

Information Processing and Management, 2009

Emotional expression and understanding are normal instincts of human beings, but automatic emotion recognition from speech, without reference to any language or linguistic information, remains an open problem. The limited size of existing emotional speech data sets and their relatively high dimensionality have outstripped many dimensionality reduction and feature selection algorithms. This paper focuses on data preprocessing techniques that aim to extract the most effective acoustic features for improving the performance of emotion recognition. A novel algorithm is presented that can be applied to a small data set with a large number of features. The presented algorithm integrates the advantages of a decision tree method and the random forest ensemble. Experimental results on a series of Chinese emotional speech data sets indicate that the presented algorithm achieves improved emotion recognition results, outperforming the commonly used Principal Component Analysis (PCA) and Multi-Dimensional Scaling (MDS) methods, as well as the more recently developed Isomap dimensionality reduction method.
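At a high level, tree-ensemble feature selection of the kind described can look like the sketch below, which ranks features by random-forest importance and keeps the more informative half. The threshold, ensemble size, and data are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch: feature selection via random-forest importances
# on a small-sample, high-dimensional data set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 200))           # few samples, many features
y = rng.integers(0, 4, size=120)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_  # averaged over the tree ensemble
keep = importances > np.median(importances)  # keep the upper half
X_reduced = X[:, keep]
print(X_reduced.shape)
```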

Application of Feature Subset Selection Based on Evolutionary Algorithms for Automatic Emotion Recognition in Speech

Lecture Notes in Computer Science, 2007

The study of emotions in human-computer interaction is a growing research area. Focusing on automatic emotion recognition, work is being performed to achieve good results, particularly in speech and facial gesture recognition. In this paper we present a study performed to analyze the validity of different machine learning techniques in the area of automatic speech emotion recognition. Using a bilingual affective database, different speech parameters were calculated for each audio recording. Then, several machine learning techniques were applied to evaluate their usefulness in speech emotion recognition. In this particular case, techniques based on evolutionary algorithms (EDA) were used to select speech feature subsets that optimize the automatic emotion recognition success rate. The experimental results show a notable increase in that success rate.
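One common EDA variant, a univariate marginal EDA over binary feature masks, gives a flavor of the wrapper approach described above. The classifier, population sizes, generation count, and data here are illustrative assumptions, not the paper's configuration.

```python
# Assumption-laden sketch of EDA-style feature subset selection:
# sample masks from per-feature inclusion probabilities, score them,
# and re-estimate the probabilities from the elite masks.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 30))
y = rng.integers(0, 4, size=150)

def fitness(mask):
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

p = np.full(X.shape[1], 0.5)                # marginal inclusion probabilities
for _ in range(10):                         # EDA generations
    pop = rng.random((20, X.shape[1])) < p  # sample candidate masks
    fit = np.array([fitness(m) for m in pop])
    elite = pop[np.argsort(fit)[-5:]]       # keep the best masks
    p = 0.5 * p + 0.5 * elite.mean(axis=0)  # re-estimate the distribution
best = rng.random(X.shape[1]) < p
print(best.sum(), "features selected")
```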

A Comparison Using Different Speech Parameters in the Automatic Emotion Recognition Using Feature Subset Selection Based on Evolutionary Algorithms

Lecture Notes in Computer Science, 2007

The study of emotions in human-computer interaction is a growing research area. Focusing on automatic emotion recognition, work is being performed to achieve good results, particularly in speech and facial gesture recognition. This paper presents a study in which, using a wide range of speech parameters, the improvement in emotion recognition rates is analyzed. Using an emotional multimodal bilingual database for Spanish and Basque, emotion recognition rates in speech have significantly improved for both languages compared with previous studies. In this particular case, as in previous studies, machine learning techniques based on evolutionary algorithms (EDA) have proven to be the best optimizers of the emotion recognition rate.

An evolutionary optimization method for selecting features for speech emotion recognition

TELKOMNIKA, 2023

Human-computer interaction benefits greatly from emotion recognition from speech. To promote a contact-free environment during the coronavirus disease 2019 (COVID-19) pandemic, most digitally based systems used speech-based devices, so emotion detection from speech has many beneficial applications, including in pathology. The vast majority of speech emotion recognition (SER) systems are designed around machine learning or deep learning models and therefore demand considerable computing power. This issue was previously addressed by developing traditional algorithms for feature selection. Recent research has shown that nature-inspired or evolutionary algorithms such as equilibrium optimization (EO) and cuckoo search (CS) based meta-heuristic approaches are superior to traditional feature selection (FS) models in terms of recognition performance. The purpose of this study is to investigate the impact of meta-heuristic feature selection approaches on emotion recognition from speech. To achieve this, we selected the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and obtained a maximum recognition accuracy of 89.64% using the EO algorithm and 92.71% using the CS algorithm. As a final step, we plotted the associated precision and F1 score for each of the emotional classes.
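A heavily simplified stand-in for such meta-heuristic wrappers is sketched below: stochastic bit-flip search over feature masks scored by cross-validated accuracy. Real EO and CS use much richer update rules (equilibrium candidate pools, Lévy flights), and everything here, including the data and classifier, is an illustrative assumption.

```python
# Simplified wrapper-style feature search, standing in for EO/CS:
# flip one feature in or out per step and keep non-worsening masks.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 25))            # stand-in for RAVDESS features
y = rng.integers(0, 8, size=200)          # RAVDESS has eight emotion classes

def fitness(mask):
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean()

mask = rng.random(X.shape[1]) < 0.5
best = fitness(mask)
for _ in range(50):                        # candidate-generation loop
    cand = mask.copy()
    cand[rng.integers(X.shape[1])] ^= True  # flip one feature in or out
    f = fitness(cand)
    if f >= best:                           # accept non-worsening moves
        mask, best = cand, f
print(mask.sum(), "features, CV accuracy", round(best, 3))
```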

Upgrading the Performance of Speech Emotion Recognition at the Segmental Level

IOSR Journal of Computer Engineering, 2013

This paper presents an efficient approach for maximizing the accuracy of automatic speech emotion recognition in English, using minimal inputs, minimal features, lower algorithmic complexity, and reduced processing time. Whereas the findings reported here are based on the exclusive use of vowel formants, most related previous works used tens or even hundreds of other features. In spite of using more extensive signal processing, the recognition accuracy reported earlier was often lower than that obtained by our approach. The method is based on vowel utterances, and the first step comprises statistical pre-processing of the vowel formants. This is followed by the identification of the best formants using the K-Means, k-nearest neighbor, and Naive Bayes classifiers. The artificial neural network used for the final classification gave an accuracy of 95.6% on elicited emotional speech. Nearly 1500 speech files from ten female speakers, covering the neutral state and six basic emotions, were used to demonstrate the efficiency of the proposed approach. Such a result has not been reported earlier for English and is of significance to researchers, sociologists, and others interested in speech.
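The final ANN stage might look like the sketch below, assuming per-vowel formant statistics have already been extracted in an earlier step. The feature layout (means and standard deviations of F1 to F3) and the network size are guesses for illustration, not the paper's configuration.

```python
# Minimal sketch of an ANN classifier over precomputed vowel-formant
# statistics; data is synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 6))      # e.g. mean and std of F1, F2, F3 per vowel
y = rng.integers(0, 7, size=300)   # neutral plus six basic emotions

ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
ann.fit(X, y)
print(ann.predict(X[:3]))          # predicted emotion labels
```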

Optimal Feature for Emotion Recognition from Speech

2016

Emotion recognition plays an important role in applications ranging from content-based digital library search to psychological analysis. The motivation of this paper is to improve the quality of speech emotion recognition by separating the emotions clearly, using appropriate speech features with appropriate classifiers. When recognizing emotions from speech after feature extraction and model generation, the acoustic resemblance of certain emotion pairs, such as happiness/anger and sadness/boredom, produces high correlations/likelihoods, so the extracted features cannot distinguish them uniquely. In this paper, the classification of seven emotional states (anger, happiness, sadness, neutral, fear, disgust, and boredom) uses optimal features selected by the Sequential Forward Selection (SFS) algorithm with a Gaussian Mixture Model (GMM) classifier. Short-term energy, MFCC, and pitch contour are the features considered in this work. Selecting the optimal features is used for recogni...
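Sequential forward selection itself is straightforward to sketch. Below, a Gaussian Naive Bayes model stands in for the paper's per-emotion GMMs, and the feature count, target subset size, and data are synthetic placeholders.

```python
# Sketch of sequential forward selection (SFS): greedily add the feature
# that most improves cross-validated accuracy until the target size.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(6)
X = rng.normal(size=(210, 20))        # e.g. energy, MFCC, pitch statistics
y = rng.integers(0, 7, size=210)      # seven emotional states

sfs = SequentialFeatureSelector(GaussianNB(), n_features_to_select=8,
                                direction="forward", cv=3)
sfs.fit(X, y)
print(np.flatnonzero(sfs.get_support()))  # indices of selected features
```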

A hierarchical approach with feature selection for emotion recognition from speech

2012

We examine speaker-independent emotion classification from speech, reporting experiments on the Berlin database across six basic emotions. Our approach is novel in a number of ways. First, it is hierarchical, motivated by our belief that the most suitable feature set for classification is different for each pair of emotions. Further, it uses a large number of feature sets of different types, such as prosodic, spectral, glottal-flow-based, and AM-FM features. Finally, it employs a two-stage feature selection strategy to achieve discriminative dimensionality reduction. The approach results in a classification rate of 85%, comparable to the state of the art on this dataset.
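The pairwise idea can be condensed as follows: each emotion pair gets its own feature subset and binary classifier, combined by one-vs-one voting. Simple univariate ranking stands in for the paper's two-stage selection strategy, and all data, subset sizes, and classifier choices are illustrative.

```python
# Condensed sketch of pairwise classification with pair-specific features.
import numpy as np
from itertools import combinations
from sklearn.feature_selection import f_classif
from sklearn.svm import SVC

rng = np.random.default_rng(7)
X = rng.normal(size=(240, 50))
y = rng.integers(0, 6, size=240)          # six basic emotions

pair_models = {}
for a, b in combinations(range(6), 2):
    idx = np.flatnonzero((y == a) | (y == b))
    scores, _ = f_classif(X[idx], y[idx])
    feats = np.argsort(scores)[-10:]      # pair-specific feature subset
    pair_models[(a, b)] = (feats, SVC().fit(X[idx][:, feats], y[idx]))

def predict(x):
    """One-vs-one voting over all pairwise classifiers."""
    votes = np.zeros(6)
    for (a, b), (feats, clf) in pair_models.items():
        votes[int(clf.predict(x[feats][None])[0])] += 1
    return int(np.argmax(votes))

print(predict(X[0]))
```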