Automatic Sound Classification Inspired by Auditory Scene Analysis

Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis

EURASIP Journal on Advances in Signal Processing, 2005

A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
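The abstract above describes a typical two-stage pipeline: hand-crafted, auditory-inspired features followed by a simple or complex pattern classifier. Below is a minimal sketch of that idea using two illustrative features (spectral centroid and an amplitude-modulation measure) and a minimum-distance (nearest class mean) rule; the feature set, frame sizes, and class names are assumptions for illustration, not the authors' exact design.

```python
# Sketch of a minimum-distance (nearest-class-mean) classifier over simple
# hand-crafted clip features. The two features below are illustrative
# stand-ins, not the feature set used in the paper.
import numpy as np

def extract_features(signal, sr, frame_len=1024, hop=512):
    """Return a 2-element clip feature vector: [mean spectral centroid, RMS modulation depth]."""
    feats = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        spectrum = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
        centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
        rms = np.sqrt(np.mean(frame ** 2))
        feats.append([centroid, rms])
    feats = np.asarray(feats)
    # Crude amplitude-modulation cue: variability of frame RMS over the clip.
    mod_depth = np.std(feats[:, 1]) / (np.mean(feats[:, 1]) + 1e-12)
    return np.array([np.mean(feats[:, 0]), mod_depth])

class MinimumDistanceClassifier:
    """Assigns a clip to the class whose mean feature vector is closest (Euclidean)."""
    def fit(self, X, y):
        y = np.asarray(y)
        self.classes_ = sorted(set(y))
        self.means_ = {c: X[y == c].mean(axis=0) for c in self.classes_}
        return self

    def predict(self, X):
        return [min(self.classes_, key=lambda c: np.linalg.norm(x - self.means_[c]))
                for x in X]
```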

Studies and Improvements in Automatic Classification of Musical Sound Samples

2003

In this article we deal with automatic classification of sound samples and ways to improve the classification results. We describe a classification process that achieves a high classification success rate (over 95% for musical instruments) and compare the results of three classification algorithms: multidimensional Gauss, KNN and LVQ. Next, we introduce several algorithms for improving the self-consistency of the sound database by removing outliers: LOO, IQR and MIQR. We present an efficient process for Gradual Elimination of Descriptors using Discriminant Analysis (GDE), which improves on a previous descriptor selection algorithm. It also enables us to reduce the computational complexity and space requirements of a sound classification process according to specific accuracy needs. Moreover, it allows finding the dominant separating characteristics of the sound samples in a database according to the classification taxonomy. The article ends by showing that good classification results do not necessarily mean generalized recognition of the dominant sound-source characteristics; the classifier might actually be focused on specific attributes of the classified database. By enriching the learning database with diverse samples from other databases we obtain a more general classifier. The dominant descriptors provided by GDE are then more closely related to what are supposed to be the distinctive characteristics of the sound sources.
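The abstract does not spell out how its LOO, IQR and MIQR variants differ, but a conventional interquartile-range filter applied per descriptor is one plausible reading of the IQR step. The sketch below shows that standard procedure; the multiplier k = 1.5 is the usual convention and is assumed here.

```python
# Sketch of per-descriptor IQR outlier removal: a sample is dropped if any of
# its descriptors falls outside [Q1 - k*IQR, Q3 + k*IQR].
import numpy as np

def remove_iqr_outliers(X, k=1.5):
    """X: (n_samples, n_descriptors) array. Returns the retained rows and a keep-mask."""
    q1 = np.percentile(X, 25, axis=0)
    q3 = np.percentile(X, 75, axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    mask = np.all((X >= lower) & (X <= upper), axis=1)
    return X[mask], mask

# Example: filter one class's descriptor matrix before training.
# clean_X, kept = remove_iqr_outliers(class_descriptors)
```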

A Comparison of Techniques for Audio Environment Classification

Excessive background noise is one of the most common complaints from hearing aid users. Background noise classification systems can be used in hearing aids to adjust the response based on the noise environment. This paper examines and compares several classification techniques, namely the k-nearest neighbours (K-NN) classifier, the non-windowed artificial neural network (ANN) and the hidden Markov model (HMM), against an artificial neural network using windowed input (WANN). Results obtained indicate that the WANN gives an accuracy of up to 97.9%, which is the highest accuracy of the tested classifiers. The memory and computational requirements of the windowed ANN are also small compared to the HMM and K-NN. Overall, the WANN is able to give excellent accuracy and reliability and is considered to be a good choice for background noise classification in hearing aids.
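"Windowed input" here presumably means stacking several consecutive feature frames into a single input vector so the network sees short-term temporal context. The sketch below illustrates that construction; the frame features, window length, hop and network size are assumptions, not the configuration evaluated in the paper.

```python
# Sketch of a windowed ANN input: consecutive feature frames are concatenated
# into one vector per training example.
import numpy as np
from sklearn.neural_network import MLPClassifier

def stack_windows(frames, win=10, hop=5):
    """frames: (n_frames, n_feats) array. Returns (n_windows, win * n_feats)."""
    windows = [frames[i:i + win].ravel()
               for i in range(0, len(frames) - win + 1, hop)]
    return np.asarray(windows)

# Hypothetical usage, given per-clip frame-feature matrices and clip labels:
# X = np.vstack([stack_windows(f) for f in frame_feature_matrices])
# y = np.concatenate([[label] * len(stack_windows(f))
#                     for f, label in zip(frame_feature_matrices, clip_labels)])
# clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
```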

Problems with Automatic Classification of Musical Sounds

Intelligent Information Processing and Web Mining, 2003

Convenient searching of multimedia databases requires well annotated data. Labeling sound data with information like pitch or timbre must be done through sound analysis. In this paper, we deal with the problem of automatic classification of musical instruments on the basis of their sound. Although there are algorithms for the extraction of basic sound descriptors, correct identification of the instrument still poses a problem. We describe difficulties encountered when classifying the woodwinds, brass, and strings of a contemporary orchestra. We discuss the most difficult cases and explain why these sounds cause problems. The conclusions are drawn and presented in a brief summary closing the paper.

Sound Classification Using Python

ITM Web of Conferences, 2021

Sound plays a significant role in human life. It is one of the fundamental sensory inputs we perceive from the environment, and it has three principal attributes: amplitude (the loudness of the sound), frequency (the pitch of the sound), and timbre (the quality or identity of the sound, for example the difference between a piano and a violin). A sound is an event generated by an action. Humans are highly efficient at learning and recognizing new and varied types of sounds and sound events. A great deal of research is ongoing on automatic sound classification, and it is used in various real-world applications. The paper presents an evaluation of a background noise classifier based on a pattern recognition approach using a neural network. The signals submitted to the neural network are described through a set of 12 MFCC (Mel Frequency Cepstral Coefficient) parameters typically available at the front end of a mobile terminal...
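A minimal sketch of the front end described above: 12 MFCCs per frame, summarised per clip, then fed to a small neural network. The paper does not state which library it uses; librosa and scikit-learn, along with the summarisation choice and network size, are assumptions here.

```python
# Sketch of a 12-MFCC front end feeding a small neural-network classifier.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def clip_mfcc_features(path, n_mfcc=12):
    y, sr = librosa.load(path, sr=None, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, n_frames)
    # Summarise the frame-level coefficients with their per-clip mean and std.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical usage with a list of file paths and matching labels:
# X = np.vstack([clip_mfcc_features(p) for p in paths])
# clf = MLPClassifier(hidden_layer_sizes=(24,), max_iter=1000).fit(X, labels)
```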

Evaluation of sound classification algorithms for hearing aid applications

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

Automatic program switching has been shown to be greatly beneficial for hearing aid users. This feature is mediated by a sound classification system, which is traditionally implemented using simple features and heuristic classification schemes, resulting in an unsatisfactory performance in complex auditory scenarios. In this study, a number of experiments are conducted to systematically assess the impact of more sophisticated classifiers and features on automatic acoustic environment classification performance. The results show that advanced classifiers, such as Hidden Markov Model (HMM) or Gaussian Mixture Model (GMM), greatly improve classification performance over simple classifiers. This change does not require a great increase of computational complexity, provided that a suitable number (5 to 7) of low-level features are carefully chosen. These findings indicate that advanced classifiers can be feasible in hearing aid applications.
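One common way to use a GMM for environment classification, consistent with the setup described above, is to fit one mixture per acoustic class on low-level feature vectors and assign a test vector to the class with the highest log-likelihood. The sketch below shows that scheme; the mixture size and covariance type are illustrative choices, not those evaluated in the paper.

```python
# Sketch of a per-class GMM classifier: fit one Gaussian mixture per class,
# predict by maximum log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

class GMMClassifier:
    def __init__(self, n_components=4):
        self.n_components = n_components

    def fit(self, X, y):
        y = np.asarray(y)
        self.models_ = {}
        for c in sorted(set(y)):
            gmm = GaussianMixture(n_components=self.n_components,
                                  covariance_type="diag", random_state=0)
            self.models_[c] = gmm.fit(X[y == c])
        return self

    def predict(self, X):
        classes = list(self.models_)
        loglik = np.column_stack([self.models_[c].score_samples(X) for c in classes])
        return [classes[i] for i in np.argmax(loglik, axis=1)]
```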

An Expert System for Automatic Classification of Sound Signals

Journal of Telecommunications and Information Technology, 2020

In this paper, we present the results of research focusing on methods for the recognition and classification of audio signals. We consider the results of the research project to serve as a basis for the main module of a hybrid expert system currently under development. In our earlier studies, we conducted research on the effectiveness of three classifiers: a fuzzy classifier, a neural classifier and the WEKA system, for reference data. In this project, a particular emphasis was placed on fine-tuning the fuzzy classifier model and on identifying neural classifier applications, taking into account new neural networks that we have not studied so far in connection with sound classification methods.

Hybrid sound classification

2015

We posit a classification of sounds useful for studies of sound recognition and identification that accounts for both signal properties (source sound characteristics) and human perception (sound uses). This classification is split into four main branches: (1) systemic (speech and music) sounds, (2) environmental sounds, (3) warning sounds, and (4) animal sounds. We describe the differences between each in terms of criteria related to perception, production and goal. We outline the advantages of our classification, which considers the use of a sound within the context of a communication act, for example within linguistics, or in harmonics, for musicology. Considering a sound both as a set of acoustic characteristics perceived by a human, and as having particular uses determined by a human, this classification permits a meaningful approach to the study of sound from object- and human-centered perspectives. PACS no. 43.60.+d, 43.90.+v, 43.75.Cd

A computational model of auditory feature extraction and sound classification

2005

Models of auditory processing, particularly of speech, face many difficulties. Included in these are variability among speakers, variability in speech rate, and robustness to moderate distortions such as time compression. We constructed a system based on ensembles of feature detectors derived from fragments of an onset-sensitive sound representation. This method is based on the idea of 'spectro-temporal response fields' and uses convolution to measure the degree of similarity through time between the feature detectors and the stimulus. The output from the ensemble was used to derive segmentation cues and patterns of response, which were used to train an artificial neural network (ANN) classifier. This allowed us to estimate a lower bound for the mutual information between the class of the input and the class of the output. Our results suggest that there is significant information in the output of our system, and that this is robust with respect to the exact choice of feature set...
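The core operation described above, measuring similarity through time between a spectro-temporal fragment (the feature detector) and the stimulus, can be sketched as a 2-D cross-correlation of the fragment against a spectrogram. The authors' onset-sensitive representation is replaced here by a plain magnitude spectrogram, and the fragment source and STFT settings are assumptions for illustration.

```python
# Sketch of sliding a spectro-temporal fragment along a spectrogram and
# producing a similarity trace over time.
import numpy as np
from scipy.signal import stft, correlate2d

def detector_response(signal, sr, fragment):
    """fragment: (n_freq_bins, n_time_frames) template cut from a training spectrogram."""
    _, _, Z = stft(signal, fs=sr, nperseg=512)
    spec = np.abs(Z)
    # 'valid' correlation slides the fragment along time only, provided the
    # fragment spans the same frequency bins as the spectrogram rows.
    return correlate2d(spec, fragment, mode="valid").ravel()

# Hypothetical usage with a list of learned fragments:
# responses = np.stack([detector_response(x, sr, frag) for frag in fragments])
# Peaks in each response trace can serve as segmentation cues / ANN inputs.
```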