Audio classification utilizing a rule-based approach and the support vector machine classifier
Related papers
Classification of Audio Data using Support Vector Machine
2011
Audio mining extracts patterns and features from audio signals in order to obtain data mining results. Various audio features, such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Compactness, Spectral Flux (SF), Band Periodicity (BP), and Zero Crossing Rate (ZCR), are used to classify audio data into classes. Classification algorithms such as Naive Bayes, FT, J48, ID3, and LibSVM are used to assign audio data to the defined classes. The results of these algorithms are compared using performance parameters such as True Positive (TP) Rate and False Positive (FP) Rate.
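As a minimal sketch of the two performance parameters named above, the TP rate and FP rate of one class can be computed from true and predicted labels as follows. The function name `tp_fp_rates` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def tp_fp_rates(y_true, y_pred, positive):
    """Return (TP rate, FP rate) with one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # TP rate = TP / (TP + FN); FP rate = FP / (FP + TN).
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels for five clips, for illustration only.
y_true = ["music", "speech", "music", "speech", "music"]
y_pred = ["music", "music", "music", "speech", "speech"]
rates = tp_fp_rates(y_true, y_pred, "music")
```

Here two of the three music clips are recognised (TP rate 2/3) and one of the two speech clips is mislabelled as music (FP rate 1/2).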
An Efficient Audio Classification Approach Based on Support Vector Machines
International Journal of Advanced Computer Science and Applications, 2016
In audio classification aimed at identifying the composer, the use of adequate and relevant features is important for improving performance, especially when the classification algorithm is based on support vector machines. As opposed to conventional approaches, which often use timbral features based on a time-frequency representation of the musical signal with a constant window, this paper presents a new audio classification method which improves feature extraction according to the Constant Q Transform (CQT) approach and includes original audio features related to the musical context in which the notes appear. The enhancement made by this work also lies in the proposal of an optimal feature selection procedure which combines filter and wrapper strategies. Experimental results show the accuracy and efficiency of the adopted approach in binary classification as well as in multi-class classification.
Over the last few years, considerable effort has been made to develop methods for extracting information from audiovisual media so that they may be stored and retrieved in databases automatically. Audio classification is a fundamental step in coping with the rapid growth in audio data volume. Automatic audio classification is very useful in content-based audio retrieval and online audio distribution. The accuracy of the classification relies on the efficacy of the features and the classification scheme. In this work, both time-domain and frequency-domain features are extracted from the input signal. The time-domain feature is Root Mean Square (RMS). The frequency-domain feature is spectral flux. After feature extraction, classification is performed. The selection of the important features is explained, and the classifiers used for classification are compared.
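The two features named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: definitions of spectral flux vary, and the version here is the Euclidean distance between the magnitude spectra of consecutive frames, computed with a naive DFT so that the example needs only the standard library. The helper names and the test tones are hypothetical.

```python
import cmath
import math


def rms(frame):
    """Root mean square of one frame of samples (time-domain feature)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(prev_frame, frame):
    """Euclidean distance between the magnitude spectra of two frames
    (one common definition, assumed here)."""
    prev_mag = magnitude_spectrum(prev_frame)
    mag = magnitude_spectrum(frame)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mag, prev_mag)))


# Two 16-sample test tones with 2 and 4 cycles per frame (illustrative data).
n = 16
tone_a = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
tone_b = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]

energy = rms(tone_a)                           # about 0.707 for a unit sine
flux_same = spectral_flux(tone_a, tone_a)      # identical spectra give zero flux
flux_changed = spectral_flux(tone_a, tone_b)   # energy moved to another bin
```

A steady sound yields near-zero flux from frame to frame, while a change in spectral content yields a large value, which is what makes the feature useful for separating audio classes.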
Content-based audio classification and segmentation by using support vector machines
Multimedia Systems, 2003
Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present our work on audio segmentation and classification which employs support vector machines (SVMs). Five audio classes are considered in this paper: silence, music, background sound, pure speech, and non-pure speech, which includes speech over music and speech over noise. A sound stream is segmented by classifying each sub-segment into one of these five classes. We have evaluated the performance of SVM on classification of different audio type pairs with testing units of different lengths, and compared the performance of SVM, K-Nearest Neighbor (KNN), and Gaussian Mixture Model (GMM). We also evaluated the effectiveness of some newly proposed features. Experiments on a database composed of about 4 hours of audio data show that the proposed classifier is very efficient on audio classification and segmentation. They also show that the accuracy of the SVM-based method is much better than that of the methods based on KNN and GMM.
Feature Analysis for Audio Classification
Lecture Notes in Computer Science, 2014
In this work we analyze and implement several audio features. We focus our analysis on the ZCR feature and propose a modification that makes it more robust when signals are near zero. The features are all used to discriminate the following audio classes: music, speech, and environmental sound. An SVM classifier, which has proven to be efficient for audio classification, is used as the classification tool. By means of a selection heuristic we draw conclusions about how the features may be combined for fast classification.
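The abstract does not say how the ZCR is modified, so the following is only one plausible sketch of the general idea: samples whose magnitude falls inside a small dead zone around zero are ignored, so that low-level noise does not register as crossings. The `threshold` parameter and the function name are assumptions made for illustration, not the paper's method.

```python
def zero_crossing_rate(frame, threshold=0.0):
    """Fraction of adjacent sample pairs at which the signal changes sign.

    Samples with |x| <= threshold are treated as having no sign and are
    skipped (a dead zone, assumed here; not the paper's stated method).
    """
    if len(frame) < 2:
        return 0.0
    crossings = 0
    prev_sign = 0
    for x in frame:
        if x > threshold:
            sign = 1
        elif x < -threshold:
            sign = -1
        else:
            continue  # inside the dead zone: keep the previous sign
        if prev_sign != 0 and sign != prev_sign:
            crossings += 1
        prev_sign = sign
    return crossings / (len(frame) - 1)


# Illustrative frames: low-level noise around zero and a loud alternating signal.
quiet = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01]
loud = [0.5, -0.5, 0.5, -0.5]

plain_quiet = zero_crossing_rate(quiet)          # every pair crosses zero
robust_quiet = zero_crossing_rate(quiet, 0.05)   # noise is inside the dead zone
robust_loud = zero_crossing_rate(loud, 0.05)     # real crossings still counted
```

With the dead zone, the near-silent frame no longer looks like a high-frequency signal, while the loud frame keeps its full crossing rate.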
SVM binary decision tree architecture for multi-class audio classification
ELMAR, 2012 Proceedings, 2012
The paper presents the support vector machine binary decision tree scheme (SVM-BDT) used for broadcast news (BN) audio classification. The SVM-BDT architecture was designed to solve the multi-class discrimination problem for the considered acoustic events: pure speech, speech with music, speech with environment sound, music, and environment sound. Its performance was investigated by using Mel-frequency cepstral coefficients (MFCCs), a powerful signal parameterization technique, for each SVM binary classifier. The one-against-all strategy in combination with the Euclidean distance algorithm was implemented in the discrimination process in order to decrease the influence of misclassification between classes.
Mixed type audio classification with support vector machine
2006
Content-based classification of audio data is an important problem for various applications such as overall analysis of audio-visual streams, boundary detection of video story segments, extraction of speech segments from video, and content-based video retrieval. Though the classification of audio into a single type such as music, speech, environmental sound, or silence is well studied, classification of mixed-type audio data, such as clips having speech with music as background, is still considered a difficult problem.
An efficient use of support vector machines for speech signal classification
Proceedings of the 8th …, 2009
In this paper, we propose a new classifier, called support vector machines, to classify speech signals. We have achieved very good generalization performance by implementing the support vector machines with various kernel functions. The use of a One-vs-One classifier with a voting algorithm improves the efficiency of speech signal classification systems.
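The One-vs-One scheme with voting can be sketched as follows: one binary classifier is trained per pair of classes, and the class that wins the most pairwise decisions is returned. To keep the example self-contained, each pairwise classifier here is a nearest-class-mean rule standing in for a binary SVM; that substitution, the function names, and the toy data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def train_pairwise(samples, labels):
    """Build one binary decision function per class pair.

    Stand-in classifier: nearest class mean (NOT an SVM), so that the
    example runs with the standard library alone.
    """
    classes = sorted(set(labels))
    means = {}
    for c in classes:
        members = [s for s, l in zip(samples, labels) if l == c]
        means[c] = [sum(col) / len(members) for col in zip(*members)]

    def make_classifier(a, b):
        def decide(x):
            dist_a = sum((xi - mi) ** 2 for xi, mi in zip(x, means[a]))
            dist_b = sum((xi - mi) ** 2 for xi, mi in zip(x, means[b]))
            return a if dist_a <= dist_b else b
        return decide

    return [make_classifier(a, b) for a, b in combinations(classes, 2)]


def predict_by_voting(classifiers, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy two-dimensional feature vectors for three classes (illustrative data).
samples = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
labels = ["music", "music", "speech", "speech", "noise", "noise"]

classifiers = train_pairwise(samples, labels)
predicted = predict_by_voting(classifiers, [4.5, 5.2])
```

For K classes this builds K(K-1)/2 pairwise classifiers, three in this example. The query point wins both pairwise contests that involve "speech", so that class is returned.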
Audio classification and categorization based on wavelets and support vector Machine
IEEE Transactions on Speech and Audio Processing, 2005
In this paper, an improved audio classification and categorization technique is presented. This technique makes use of wavelets and support vector machines (SVMs) to accurately classify and categorize audio data. When a query audio is given, wavelets are first applied to extract acoustical features such as subband power and pitch information. Then, the proposed method uses a bottom-up SVM over these acoustical features and additional parameters, such as frequency cepstral coefficients, to accomplish audio classification and categorization. A public audio database (Muscle Fish), which consists of 410 sounds in 16 classes, is used to evaluate the performances of the proposed method against other similar schemes. Experimental results show that the classification errors are reduced from 16 (8.1%) to six (3.0%), and the categorization accuracy of a given audio sound can achieve 100% in the Top 2 matches.
Classification of audio signals using SVM and RBFNN
Expert Systems with Applications, 2009
In the age of digital information, audio data has become an important part of many modern computer applications. Audio classification has become a focus of research in audio processing and pattern recognition. Automatic audio classification is very useful for audio indexing, content-based audio retrieval, and online audio distribution, but it is a challenge to extract the most common and salient themes from unstructured raw audio data. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon, and movie. For these categories, a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients, and mel-frequency cepstral coefficients, are extracted to characterize the audio content. Support vector machines are applied to classify audio into the respective classes by learning from training data. The proposed method then extends the application of a radial basis function neural network (RBFNN) to the classification of audio. The RBFNN applies a nonlinear transformation followed by a linear transformation to achieve a higher dimension in the hidden space. Experiments on different genres of the various categories show that the classification results are significant and effective.
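The "nonlinear transformation followed by linear transformation" of an RBF network can be illustrated with a small forward pass: Gaussian hidden units map the input into a hidden space, and a linear output layer then separates the classes there. The centers, weights, bias, and `gamma` value below are hand-picked for the classic XOR example and are assumptions for illustration; they are not parameters from the paper.

```python
import math


def rbf_forward(x, centers, weights, bias, gamma=1.0):
    """Forward pass of a tiny RBF network with one output.

    Hidden layer: Gaussian units exp(-gamma * ||x - c||^2)  (nonlinear step).
    Output layer: weighted sum of hidden activations plus bias (linear step).
    """
    hidden = [
        math.exp(-gamma * sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        for c in centers
    ]
    return sum(w * h for w, h in zip(weights, hidden)) + bias


# Hand-picked parameters (illustrative only): two centers on the diagonal.
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, 1.0]
bias = -1.0

# XOR-style inputs are not linearly separable in the input space, but the
# sign of the output separates them after the Gaussian mapping.
outputs = {
    point: rbf_forward(list(point), centers, weights, bias)
    for point in [(0, 0), (1, 1), (0, 1), (1, 0)]
}
```

The points (0, 0) and (1, 1) produce positive outputs and the points (0, 1) and (1, 0) produce negative ones, so a single linear threshold in the hidden space is enough.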