INESC Porto
Related papers
2010
The Bag-of-Frames (BoF) approach has been widely used in music genre classification. In this approach, music genres are represented by statistical models of low-level features computed on short frames (e.g. tens of ms) of the audio signal. In the design of such models, a common procedure in BoF approaches is to represent each music genre by sets of instances (i.e. frame-based feature vectors) inferred from training data. The underlying assumption is that the majority of such instances somehow capture the (musical) specificities of each genre, and that good classification performance is a matter of training dataset size and of fine-tuning the feature extraction and learning algorithm parameters. We report on extensive tests on two music databases that contradict this assumption. We show that there is little or no benefit in seeking a thorough representation of the feature vectors for each class. In particular, we show that genre classification performance is similar whether music pieces from a number of different genres are represented with a set of symbols derived from a single genre or from all the genres. We conclude that our experiments provide additional evidence for the hypothesis that common low-level features of isolated audio frames are not representative of music genres.
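The BoF pipeline that this abstract critiques can be sketched in a few lines: quantize frame-level feature vectors against a learned codebook, then describe each piece by its codeword histogram. The sketch below is a minimal, illustrative version with a toy k-means and synthetic features; it is not the authors' implementation.

```python
import numpy as np

def kmeans(frames, k, iters=10, seed=0):
    """Toy k-means to build a codebook over frame-level feature vectors."""
    rng = np.random.default_rng(seed)
    centroids = frames[rng.choice(len(frames), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each frame to its nearest centroid
        d = np.linalg.norm(frames[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = frames[labels == j].mean(axis=0)
    return centroids

def bof_histogram(frames, codebook):
    """Bag-of-frames: quantize frames and count codeword occurrences."""
    d = np.linalg.norm(frames[:, None] - codebook[None], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(codebook))
    return counts / counts.sum()
```

A genre model in this setting is simply a statistic (e.g. the mean histogram) over the histograms of its training pieces; the paper's point is that where the codebook comes from matters surprisingly little.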
Short-term Feature Space and Music Genre Classification
Journal of New Music Research, 2011
In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. In these methods, it is implicitly assumed that all frames carry equally relevant information and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov-model classifier, leads to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create fewer hubs.
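The Markov-model classifier mentioned above operates on sequences of symbols obtained by partitioning the short-term feature space. A hedged sketch follows: the partition itself is assumed given (each track is already a symbol sequence), and add-one smoothing is a choice made here for the example, not necessarily the paper's.

```python
import numpy as np

def transition_matrix(seqs, n_symbols, alpha=1.0):
    """Estimate a first-order Markov transition matrix from symbol
    sequences, with add-alpha smoothing so unseen transitions keep
    nonzero probability."""
    counts = np.full((n_symbols, n_symbols), alpha)
    for seq in seqs:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(seq, T):
    """Log-probability of a symbol sequence under transition matrix T."""
    return sum(np.log(T[a, b]) for a, b in zip(seq[:-1], seq[1:]))

def classify(seq, models):
    """Pick the genre whose Markov model gives the highest likelihood."""
    return max(models, key=lambda g: log_likelihood(seq, models[g]))
```

With one transition matrix trained per genre, classification reduces to a likelihood comparison over the test track's symbol sequence.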
Proceedings of the International Computer Music …, 2004
This paper examines the potential of high-level features extracted from symbolic musical representations with regard to musical classification. Twenty features are implemented and tested by using them to classify 225 MIDI files by genre. This system differs from previous automatic genre classification systems, which have focused on low-level features extracted from audio data. Files are classified into three parent genres and nine sub-genres, with average success rates of 84.8% for the former and 57.8% for the latter. Classification is performed by a novel configuration of feed-forward neural networks that independently classify files by parent genre and sub-genre and combine the results using weighted averages.
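The final combination step described above, merging independently trained classifiers by weighted averages, reduces to a weighted mean of the networks' output vectors. The sketch below illustrates only that arithmetic; the output vectors and weights are invented for the example and are not taken from the paper.

```python
import numpy as np

def weighted_average(outputs, weights):
    """Combine the probability vectors produced by independently trained
    classifiers into one vector, weighting each classifier's vote."""
    w = np.asarray(weights, dtype=float)
    return (np.asarray(outputs, dtype=float) * w[:, None]).sum(axis=0) / w.sum()
```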
Automatic genre classification using large high-level musical feature sets
2004
This paper presents an automatic description system of drum sounds for real-world musical audio signals. Our system can represent the onset times and names of drums by means of drum descriptors defined in the context of MPEG-7. For their automatic description, drum sounds must be identified in such polyphonic signals. The problem is that the acoustic features of drum sounds vary with each musical piece, and precise templates for them cannot be prepared in advance. To solve this problem, we propose new template-adaptation and template-matching methods. The former method adapts a single seed template, prepared for each kind of drum, to the corresponding drum sound appearing in an actual musical piece. The latter method can then detect all the onsets of each drum by using the corresponding adapted template. The onsets of bass and snare drums in any piece can thus be identified. Experimental results showed that the accuracy of identifying bass and snare drums in popular music was about 90%. Finally, we define drum descriptors in the MPEG-7 format and demonstrate an example of automatic drum sound description for a piece of popular music.
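The template-matching stage can be illustrated with a much-simplified 1-D version: slide an (already adapted) template over an onset-strength envelope and report the positions where normalized correlation is high. The adaptation step and the MPEG-7 description are omitted, and the threshold and feature representation here are placeholders, not the paper's.

```python
import numpy as np

def match_template(envelope, template, threshold=0.9):
    """Slide a drum template over a 1-D envelope and return the frame
    indices where the normalized correlation exceeds the threshold."""
    n = len(template)
    t = template - template.mean()
    t /= np.linalg.norm(t)
    onsets = []
    for i in range(len(envelope) - n + 1):
        w = envelope[i:i + n] - envelope[i:i + n].mean()
        norm = np.linalg.norm(w)
        if norm > 0 and np.dot(w, t) / norm >= threshold:
            onsets.append(i)
    return onsets
```

In the paper the matching operates on time-frequency templates rather than a scalar envelope, but the detect-by-correlation principle is the same.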
Audio feature engineering for automatic music genre classification
2007
The scenarios opened by the increasing availability, sharing and dissemination of music across the Web are pushing for fast, effective and abstract ways of organizing and retrieving music material. Automatic classification is a central activity in modelling most of these processes, so its design plays a relevant role in advanced Music Information Retrieval. In this paper, we adopted a state-of-the-art machine learning algorithm, Support Vector Machines, to design an automatic classifier of music genres. In order to optimize classification accuracy, we implemented some already proposed features and engineered new ones to capture aspects of songs that have been neglected in previous studies. The classification results on two datasets suggest that our model, based on very simple features, reaches state-of-the-art accuracy on the ISMIR dataset and very high performance on a music corpus collected locally.
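Two of the "very simple" short-term descriptors commonly fed to such an SVM can be computed in a few lines. Zero-crossing rate and spectral centroid are standard choices shown here purely as an illustration; the paper's own engineered features are not reproduced.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of consecutive samples whose sign differs -- a rough
    correlate of noisiness/brightness."""
    signs = np.sign(frame)
    return float(np.mean(signs[:-1] != signs[1:]))

def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency of the frame's spectrum (Hz)."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float((freqs * mag).sum() / mag.sum())
```

Statistics of such frame-level values (means, variances over a song) are what typically form the song-level feature vector handed to the classifier.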
Selection of training instances for music genre classification
2010
In this paper we present a method for the selection of training instances based on the classification accuracy of an SVM classifier. The instances consist of feature vectors representing short-term, low-level characteristics of music audio signals. The objective is to build, from only a portion of the training data, a music genre classifier with performance at least similar to that obtained when the whole dataset is used. The particularity of our approach lies in a pre-classification of instances prior to the main classifier training: we select from the training data those instances that show better discrimination with respect to class memberships. On a very challenging dataset of 900 music pieces divided among 10 music genres, the instance selection method improves music genre classification by 2.4 percentage points. At the same time, the resulting classification model is significantly smaller, permitting much faster classification of test data.
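The idea of pre-classifying instances before training can be mimicked with a simple edited-nearest-neighbour filter: keep a training frame only if its closest neighbour carries the same genre label. This is a deliberately simplified stand-in for the paper's SVM-based selection, shown only to make the concept concrete.

```python
import numpy as np

def select_instances(X, y):
    """Keep only instances whose nearest neighbour (excluding themselves)
    agrees with their label; ambiguous frames near other classes are
    discarded."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the instance itself
        if y[d.argmin()] == y[i]:
            keep.append(i)
    return np.array(keep)
```

Training the main classifier on `X[keep]` then yields a smaller model, which is the speed benefit the abstract reports.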
Feature Selection in Automatic Music Genre Classification
2008 Tenth IEEE International Symposium on Multimedia, 2008
This paper presents the results of applying a feature selection procedure to an automatic music genre classification system. The classification system is based on multiple feature vectors and an ensemble approach, according to time and space decomposition strategies. Feature vectors are extracted from music segments taken from the beginning, middle and end of the original music signal (time decomposition). Although music genre classification is a multi-class problem, we accomplish the task using a combination of binary classifiers whose results are merged to produce the final music genre label (space decomposition). Several machine learning algorithms were employed as individual classifiers: Naïve Bayes, Decision Trees, Support Vector Machines and Multi-Layer Perceptron Neural Nets. Experiments were carried out on a novel dataset called the Latin Music Database, which contains 3,227 music pieces categorized into 10 musical genres. The experimental results show that the employed features have different importance depending on the part of the music signal from which the feature vectors were extracted. Furthermore, the ensemble approach provides better results than the individual segments in most cases.
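Merging the binary classifiers (space decomposition) across the beginning/middle/end segments (time decomposition) amounts to vote counting. A minimal sketch, with genre labels invented for the example:

```python
from collections import Counter

def combine_votes(segment_predictions):
    """Merge the labels voted by each binary classifier, for each of the
    beginning / middle / end segments, into a single genre by majority."""
    votes = Counter()
    for segment in segment_predictions:   # time decomposition
        votes.update(segment)             # one vote per binary classifier
    return votes.most_common(1)[0][0]
```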
Musical genre classification of audio signals
Speech and Audio Processing, IEEE …, 2002
Musical genres are categorical labels created by humans to characterize pieces of music. A musical genre is characterized by the common characteristics shared by its members; these are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. Genre hierarchies are commonly used to structure the large collections of music available on the Web. Currently, musical genre annotation is performed manually. Automatic musical genre classification can assist or replace the human user in this process and would be a valuable addition to music information retrieval systems. In addition, automatic musical genre classification provides a framework for developing and evaluating features for any type of content-based analysis of musical signals.
Factors in automatic musical genre classification of audio signals
Automatic musical genre classification is an important tool for organizing the large collections of music that are becoming available to the average user. In addition, it provides a structured way of evaluating musical content features that does not require extensive user studies. The paper provides a detailed comparative analysis of various factors affecting automatic classification performance, such as choice of features and classifiers. Using recent machine learning techniques, such as support vector machines, we improve on previously published results using identical data collections and features.
Music Genre Classification with the Million Song Dataset
2011
The field of Music Information Retrieval (MIR) draws from musicology, signal processing, and artificial intelligence. A long line of work addresses problems such as music understanding (extracting the musically meaningful information from audio waveforms) and automatic music annotation (e.g. measuring song and artist similarity). However, very little of this work has scaled to commercially sized datasets, where both the algorithms and the data are complex.