Carlos Guedes | New York University Abu Dhabi
Phone: +971 2 628 5240
Address: New York University Abu Dhabi
PO Box 129188
Abu Dhabi, United Arab Emirates
Papers by Carlos Guedes
Proceedings of the Digital Libraries for Musicology Workshop, 2019
Konstantinos Trochidis, Beth Russell, Andrew Eisenberg, Kaustuv Kanti Ganguli, Oscar Gomez, Christos Plachouras, Carlos Guedes, Virginia Danielson
This paper presents an overview of ongoing research that combines the preservation of musical heritage with ethnomusicological research driven by computational analysis. The two collections of non-Eurogenetic music under study are a curated collection of East African Swahili coast music and a collection of commercial recordings of Arab music. We explore cross-cultural similarities, interactions, and patterns among music excerpts from the different regions, and seek to understand these similarities by employing computational audio analysis, machine learning, and visualization techniques. As a baseline representation, we extracted Mel-Frequency Cepstral Coefficient (MFCC) features to model the spectral characteristics of the music excerpts and used t-distributed stochastic neighbor embedding (t-SNE) to map these features into a 2-D similarity space. We compare this baseline with more sophisticated approaches to acoustic feature representation. In the first, Principal Component Analysis (PCA) was applied to the mel-scaled spectrograms of the music excerpts, and t-SNE was used to map the principal components to a 2-D space. In the second, the logarithmic short-time Fourier transform (STFT) of each music excerpt was extracted, and a deep autoencoder neural network was trained to learn the relationships and structure of the excerpts by compressing the raw STFT representation. The results of the analysis show that PCA and the autoencoder model can reveal more interesting cluster representations than MFCCs, generating more complex clusters among the different styles of the corpus.
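As a rough illustration of the PCA stage described in the abstract, the sketch below reduces a matrix of per-excerpt feature vectors to a handful of principal components using NumPy. It is not the authors' implementation: the function name `pca_reduce`, the choice of 10 components, and the synthetic two-style data standing in for flattened mel-scaled spectrograms are all assumptions made for the example. In a full pipeline the resulting components would then be passed to a t-SNE implementation to obtain the 2-D mapping; that step is omitted here.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components."""
    # Center each feature dimension so PCA captures variance, not the mean.
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    # Fraction of total variance explained by each retained component.
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return projected, explained[:n_components]


# Synthetic stand-in for flattened mel-scaled spectrograms (hypothetical data,
# not from the paper): 40 "excerpts" from two made-up styles, 128 dims each.
rng = np.random.default_rng(0)
style_a = rng.normal(loc=0.0, scale=1.0, size=(20, 128))
style_b = rng.normal(loc=3.0, scale=1.0, size=(20, 128))
features = np.vstack([style_a, style_b])

components, explained = pca_reduce(features, n_components=10)
print(components.shape)
```

With real recordings, each row of `features` would instead hold the spectrogram of one excerpt, and the number of retained components would be chosen by inspecting `explained`.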