Sparse Coding for Efficient Bioacoustic Data Mining: Preliminary Application to Analysis of Whale Songs
Related papers
2015
Efficient methods of biodiversity assessment and monitoring are central to ecological research and crucial in conservation management. Technological advances in remote acoustic sensing inspire new perspectives in ecology: environmental sound monitoring is emerging as a reliable non-invasive proxy for ecological complexity (Sueur and Farina, 2015). Rather than attempting to recognise species-specific calls, either manually or automatically, we are interested in monitoring the global acoustic environment, tackling the problem of diversity assessment at the community (rather than species) level. Preliminary work has attempted to make a case for community-level acoustic indices (e.g.
PeerJ, 2016
Passive acoustic monitoring is emerging as a promising non-invasive proxy for ecological complexity, with potential as a tool for remote assessment and monitoring (Sueur & Farina, 2015). Rather than attempting to recognise species-specific calls, either manually or automatically, there is a growing interest in evaluating the global acoustic environment. Positioned within the conceptual framework of ecoacoustics, a growing number of indices have been proposed (e.g., Pieretti, Farina & Morri, 2011; Farina, 2014; Sueur et al., 2008b) which aim to capture community-level dynamics by providing statistical summaries of the frequency or time domain signal. Although promising, the ecological relevance of these indices and their efficacy as monitoring tools are still unclear. In this paper we suggest that, by virtue of operating in the time or frequency domain, existing indices are limited in their ability to access key structural information in the spectro-temporal domain. Alternative methods in...
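As an illustration of what such a statistical summary can look like, the sketch below computes a normalised spectral entropy over a recording. This is a generic example of a frequency-domain index rather than any of the specific indices cited above, and the file name is a hypothetical placeholder.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

# Hypothetical recording; any mono WAV works for this sketch.
rate, audio = wavfile.read("soundscape.wav")

# Time-frequency decomposition of the signal.
freqs, times, sxx = spectrogram(audio, fs=rate, nperseg=1024)

# One simple community-level summary: spectral entropy of the mean
# power spectrum (flat spectra -> high entropy, tonal spectra -> low).
mean_spectrum = sxx.mean(axis=1)
p = mean_spectrum / mean_spectrum.sum()
spectral_entropy = -np.sum(p * np.log2(p + 1e-12)) / np.log2(len(p))
print(f"Normalised spectral entropy: {spectral_entropy:.3f}")
```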
Sparse representation for classification of dolphin whistles by type
The Journal of the Acoustical Society of America, 2014
A compressive-sensing approach called the Sparse Representation Classifier (SRC) is applied to the classification of bottlenose dolphin whistles by type. The SRC algorithm constructs a dictionary of whistles from the collection of training whistles. In the classification phase, an unknown whistle is represented sparsely as a linear combination of the training whistles, and the call class is then determined with an l1-norm optimization procedure. Experimental studies conducted in this research reveal the advantages and limitations of the proposed method compared with existing techniques such as K-Nearest Neighbors and Support Vector Machines in distinguishing different vocalizations.
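A minimal sketch of the sparse-representation idea is given below. It uses an l1-regularised regression (scikit-learn's Lasso) as a stand-in for the paper's exact l1-norm optimisation, and synthetic feature vectors in place of real whistle features; the class is assigned by the smallest class-wise reconstruction residual.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(y, dictionary, labels, alpha=0.01):
    """Sparse Representation Classifier: represent the test vector y as a
    sparse combination of training vectors (dictionary columns), then assign
    the class whose atoms give the smallest reconstruction residual."""
    # l1-regularised least squares stands in for the l1-norm optimisation.
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    coder.fit(dictionary, y)
    x = coder.coef_
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - dictionary @ x_c)
    return min(residuals, key=residuals.get)

# Toy example: 40 training "whistle" feature vectors from two classes.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 1, (20, 64)), rng.normal(3, 1, (20, 64))])
labels = np.array([0] * 20 + [1] * 20)
test = rng.normal(3, 1, 64)                   # should resemble class 1
print(src_classify(test, train.T, labels))    # dictionary columns = training vectors
```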
Authorea (Authorea), 2022
1. Passive Acoustic Monitoring is emerging as a solution for monitoring species and environmental change over large spatial and temporal scales. However, drawing rigorous conclusions based on acoustic recordings is challenging, as there is no consensus over which approaches and indices are best suited for characterizing marine acoustic environments. 2. We present an alternative to the use of ecoacoustic indices and describe the application of multiple machine learning techniques to the analysis of a large PAM dataset. We combine pre-trained acoustic classification models, dimensionality reduction, and random forest algorithms to demonstrate how machine-learned acoustic features capture different aspects of the marine environment. We processed two PAM databases and conducted 13 trials showing how acoustic features can be used to: i) discriminate between the vocalizations of marine mammals, beginning with high-level taxonomic groups and extending to differences between conspecifics belonging to distinct populations; ii) discriminate amongst different marine environments; and iii) detect and monitor anthropogenic and biological sound sources. 3. Acoustic features and their UMAP projections exhibited good performance in the classification of marine mammal vocalizations. Most of the taxonomic levels investigated here could be classified using the UMAP projections, apart from species that were underrepresented. Both anthropogenic (ships and airguns) and biological (humpback whales) sound sources could also be identified in field recordings. 4. We argue that acoustic feature extraction, visualization, and analysis allows the retention of most of the environmental information contained in PAM recordings, overcoming the limitations encountered when using ecoacoustic indices. Acoustic features are universal, permitting comparisons of results collected from multiple environments. Our approach can be used to simultaneously investigate the macro and micro characteristics of marine soundscapes, with a more objective method and with far less human effort.
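The sketch below illustrates the general shape of such a pipeline, with synthetic embeddings standing in for the pre-trained model's acoustic features and illustrative class labels; it is an assumption-laden outline (umap-learn and scikit-learn), not the authors' implementation.

```python
import numpy as np
import umap                              # umap-learn package
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in for machine-learned acoustic features: in the paper these come
# from a pre-trained audio classification model; here they are synthetic.
rng = np.random.default_rng(42)
features = np.vstack([rng.normal(i, 1.0, (100, 128)) for i in range(3)])
labels = np.repeat(["mysticete", "odontocete", "ship"], 100)

# Reduce the embeddings to a low-dimensional projection.
projection = umap.UMAP(n_components=2, random_state=42).fit_transform(features)

# Classify sound sources from the UMAP projection with a random forest.
X_tr, X_te, y_tr, y_te = train_test_split(projection, labels, test_size=0.3,
                                          random_state=0, stratify=labels)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"Held-out accuracy: {clf.score(X_te, y_te):.2f}")
```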
Development of a machine learning detector for North Atlantic humpback whale song
The Journal of the Acoustical Society of America, 2024
The study of humpback whale song using passive acoustic monitoring devices requires bioacousticians to manually review hours of audio recordings to annotate the signals. To vastly reduce the time of manual annotation through automation, a machine learning model was developed. Convolutional neural networks have made major advances in the previous decade, leading to a wide range of applications, including the detection of frequency modulated vocalizations by cetaceans. A large dataset of over 60 000 audio segments of 4 s length is collected from the North Atlantic and used to fine-tune an existing model for humpback whale song detection in the North Pacific (see Allen, Harvey, Harrell, Jansen, Merkens, Wall, Cattiau, and Oleson (2021). Front. Mar. Sci. 8, 607321). Furthermore, different data augmentation techniques (time-shift, noise augmentation, and masking) are used to artificially increase the variability within the training set. Retraining and augmentation yield F-score values of 0.88 on a context-window basis and 0.89 on an hourly basis, with false positive rates of 0.05 on a context-window basis and 0.01 on an hourly basis. If necessary, usage and retraining of the existing model are made convenient by a framework (AcoDet, acoustic detector) built during this project. Combining the tools provided by this framework could save researchers hours of manual annotation time and, thus, accelerate their research.
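The three augmentation strategies can be sketched with plain NumPy, as below; the parameter values, the toy segment, and the toy spectrogram are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np

def time_shift(segment, max_shift):
    """Circularly shift the waveform by a random number of samples."""
    shift = np.random.randint(-max_shift, max_shift + 1)
    return np.roll(segment, shift)

def add_noise(segment, snr_db):
    """Add Gaussian noise at a target signal-to-noise ratio (in dB)."""
    signal_power = np.mean(segment ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return segment + np.random.normal(0, np.sqrt(noise_power), segment.shape)

def mask_time(spectrogram, max_width):
    """Zero out a random block of time frames (SpecAugment-style masking)."""
    masked = spectrogram.copy()
    width = np.random.randint(1, max_width + 1)
    start = np.random.randint(0, spectrogram.shape[1] - width)
    masked[:, start:start + width] = 0.0
    return masked

# Toy 4 s segment at 2 kHz and a toy spectrogram, just to exercise the functions.
segment = np.random.randn(8000)
augmented = add_noise(time_shift(segment, max_shift=400), snr_db=10)
spec = np.abs(np.fft.rfft(augmented.reshape(50, 160), axis=1)).T
masked_spec = mask_time(spec, max_width=8)
```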
Automatic classification of humpback whale social calls
The Journal of the Acoustical Society of America
Acoustic methods are becoming increasingly common in the study of marine mammal populations and behavior. Automating the detection and classification of whale vocalizations has been a central aim of these methods. The focus has primarily been on intra-species detection and classification; however, humpback whale (Megaptera novaeangliae) social call detection and classification has largely remained a manual task in the bioacoustics community. To automate this process, we processed spectrograms of calls using PCA-based and connected-component-based methods and derived features from the relative power in the frequency bins of these spectrograms. We then used these features to train and test a supervised Hidden Markov Model (HMM) algorithm to investigate classification feasibility.
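A hedged sketch of this kind of pipeline is shown below, combining PCA-reduced spectrogram frames with a Gaussian HMM from the hmmlearn package; the data are synthetic and the settings are illustrative rather than those of the study.

```python
import numpy as np
from sklearn.decomposition import PCA
from hmmlearn import hmm           # hmmlearn package

# Toy spectrogram frames standing in for one call class: rows are time
# frames, columns are relative power in frequency bins.
rng = np.random.default_rng(1)
call_frames = [rng.normal(0, 1, (rng.integers(20, 40), 32)) for _ in range(10)]

# PCA-based dimensionality reduction of the per-frame features.
pca = PCA(n_components=6).fit(np.vstack(call_frames))
sequences = [pca.transform(f) for f in call_frames]

# One HMM per call class; here a single class is trained as an illustration.
model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
model.fit(np.vstack(sequences), lengths=[len(s) for s in sequences])

# Classification would assign a new call to the class whose HMM gives
# the highest log-likelihood.
print(model.score(sequences[0]))
```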
A Quantitative Technique to Compare and Classify Humpback Whale (Megaptera novaeangliae) Sounds
Ethology, 2010
In an attempt to minimize observer bias, numerical taxonomy methods were used to describe and classify humpback whale sounds. The spectrograms (N = 1255) were digitized into a 16 × 21 binary matrix. The rows were 16 frequencies selected on a logarithmic scale (0.12–8 kHz). The columns were 21 time samples taken every 0.1 s. Each point of the matrix was coded 1 if it lay over part of the sound. Other binary variables were added to code for relative intensity within a sound, frequency modulation and amplitude modulation. The sounds were then compared using the Jaccard similarity coefficient for binary data, and classified with average linkage cluster analysis. This technique produced 115 clusters, which were compared with my aural and visual impressions of the sounds. I agreed with most major categories identified by cluster analysis, but many small clusters had to be fused to other categories. This was partially due to the technique used, and to the complexity of the repertoire under study. Improvements are proposed to further reduce observer bias in classification of sounds, and thus make studies of animal communication performed by different researchers or on different species more easily comparable.
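The distance and clustering steps can be reproduced in outline with SciPy, as in the sketch below; the binary matrices are random stand-ins for the digitized spectrograms and the cut-off threshold is an arbitrary assumption.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Toy binary sound descriptions: each row mimics a flattened 16 x 21
# time-frequency grid (1 where the spectrogram trace crosses that cell).
rng = np.random.default_rng(7)
sounds = rng.random((50, 16 * 21)) < 0.15      # boolean matrix, 50 sounds

# Pairwise Jaccard dissimilarity between the binary vectors.
distances = pdist(sounds, metric="jaccard")

# Average-linkage hierarchical clustering, cut at a chosen dissimilarity.
tree = linkage(distances, method="average")
clusters = fcluster(tree, t=0.8, criterion="distance")
print(f"{clusters.max()} clusters found")
```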
Efficient methods of biodiversity assessment and monitoring are central to ecological research and crucial in conservation management; technological advances in remote acoustic sensing inspire new approaches. In line with the emerging field of Soundscape Ecology (Pijanowski et al., 2011), the acoustic approach is based on the rationale that the ecological processes occurring within a landscape are tightly linked to and reflected in the high-level structure of the patterns of sounds emanating from those landscapes – the soundscape. Rather than attempting to recognise species-specific calls, either manually or automatically, analysis of the high-level structure of the soundscape tackles the problem of diversity assessment at the community (rather than species) level (Pijanowski et al., 2011; Farina, 2014). Preliminary work has attempted to make a case for community-level acoustic indices (e.g. Pieretti et al., 2011; Farina, 2014; Sueur et al., 2008); existing indices provide simple st...
2019
Killer whales (Orcinus orca) can produce three types of signals: clicks, whistles and vocalizations. This study focuses on Orca vocalizations from northern Vancouver Island (Hanson Island), where the NGO Orcalab developed a multi-hydrophone recording station to study Orcas. The acoustic station is composed of 5 hydrophones and extends over 50 km² of ocean. Since 2015 we have been continuously streaming the hydrophone signals to our laboratory in Toulon, France, yielding nearly 50 TB of synchronous multichannel recordings. In previous work, we trained a Convolutional Neural Network (CNN) to detect Orca vocalizations, using transfer learning from a bird activity dataset. Here, for each detected vocalization, we estimate the pitch contour (fundamental frequency). Finally, we cluster vocalizations by features describing the pitch contour. While preliminary, our results demonstrate a possible route towards automatic Orca call type classification. Furthermore, they can be linked to the presence of ...
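A minimal sketch of the contour-based clustering step is given below, using librosa's pyin pitch tracker and k-means; the file names, frequency limits, and summary features are illustrative assumptions rather than the authors' actual pipeline.

```python
import numpy as np
import librosa
from sklearn.cluster import KMeans

def contour_features(wav_path):
    """Estimate the fundamental-frequency contour of a detected call and
    summarise it with a few simple descriptors."""
    y, sr = librosa.load(wav_path, sr=None)
    f0, voiced, _ = librosa.pyin(y, fmin=500, fmax=8000, sr=sr)
    f0 = f0[voiced]                               # keep voiced frames only
    slope = np.polyfit(np.arange(len(f0)), f0, 1)[0]
    return np.array([f0.mean(), f0.std(), f0.max() - f0.min(), slope])

# Hypothetical list of detected vocalisation clips.
clips = ["call_001.wav", "call_002.wav", "call_003.wav"]
features = np.vstack([contour_features(c) for c in clips])

# Group calls with similar pitch contours as a first step towards call types.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)
```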