Enriched Multimodal Representations of Music Performances: Online Access and Visualization

Creating a Multitrack Classical Music Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications

IEEE Transactions on Multimedia, 2019

We introduce a dataset for facilitating audiovisual analysis of music performances. The dataset comprises 44 simple multi-instrument classical music pieces assembled from coordinated but separately recorded performances of individual tracks. For each piece, we provide the musical score in MIDI format, the audio recordings of the individual tracks, the audio and video recording of the assembled mixture, and ground-truth annotation files including frame-level and note-level transcriptions. We describe our methodology for the creation of the dataset, particularly highlighting our approaches for addressing the challenges involved in maintaining synchronization and expressiveness. We demonstrate the high quality of synchronization achieved with our proposed approach by comparing the dataset with existing widely used music audio datasets. We anticipate that the dataset will be useful for the development and evaluation of existing music information retrieval (MIR) tasks, as well as for novel multi-modal tasks. We benchmark two existing MIR tasks (multi-pitch analysis and score-informed source separation) on the dataset and compare with other existing music audio datasets. Additionally, we consider two novel multi-modal MIR tasks (visually informed multi-pitch analysis and polyphonic vibrato analysis) enabled by the dataset and provide evaluation measures and baseline systems for future comparisons (from our recent work). Finally, we propose several emerging research directions that the dataset enables.

PROBADO MUSIC: A MULTIMODAL ONLINE MUSIC LIBRARY

After several years of research and development, PROBADO Music—a multimodal digital music library system—is now available to the public. To allow access to anyone from anywhere, we have prepared a collection of public domain music material that is accessible through our system. Besides streaming and presenting digital music documents (scanned sheet music, audio recordings, and lyrics), PROBADO Music employs current techniques from the field of music information retrieval to offer enhanced browsing, navigation, and search functionalities. We strongly believe that such novel library systems will appeal to music lovers and can support musicians, musicologists, and music teachers in their work.

Method Development for Multimodal Data Corpus Analysis of Expressive Instrumental Music Performance

Frontiers in Psychology, 2020

Musical performance is a multimodal experience, for performers and listeners alike. This paper reports on a pilot study which constitutes the first step toward a comprehensive approach to the experience of music as performed. We aim at bridging the gap between qualitative and quantitative approaches, by combining methods for data collection. The purpose is to build a data corpus containing multimodal measures linked to high-level subjective observations. This will allow for a systematic inclusion of the knowledge of music professionals in an analytic framework, which synthesizes methods across established research disciplines. We outline the methods we are currently developing for the creation of a multimodal data corpus dedicated to the analysis and exploration of instrumental music performance from the perspective of embodied music cognition. This will enable the study of the multiple facets of instrumental music performance in great detail, as well as lead to the development of m...

A Multimodal Way of Experiencing and Exploring Music

Significant digitization efforts have resulted in large multimodal music collections, which comprise music-related documents of various types and formats including text, symbolic data, audio, image, and video. The challenge is to organize, understand, and search musical content in a robust, efficient, and intelligent manner. Key issues concern the development of methods for analysing, correlating, and annotating the available multimodal material, thus identifying and establishing semantic relationships across various music representations and formats. Here, one important task is referred to as music synchronization, which aims at identifying and linking semantically corresponding events present in different versions of the same underlying musical work. In this paper, we give an introduction to music synchronization and show how synchronization techniques can be integrated into novel user interfaces that allow music lovers and researchers to access and explore music in all its different facets, thus enhancing human involvement with music and deepening music understanding.
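Music synchronization of the kind described above is commonly realized by extracting feature sequences (e.g., chroma vectors) from two versions of a work and aligning them with dynamic time warping (DTW). The sketch below illustrates only the core alignment step on toy one-dimensional feature sequences; the function name, the example data, and the distance function are illustrative assumptions, not taken from any of the systems surveyed here.

```python
def dtw_align(a, b, dist):
    """Globally align sequences a and b; return (total cost, warping path)."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1][j],      # advance in a only (b frame repeats)
                D[i][j - 1],      # advance in b only (a frame repeats)
                D[i - 1][j - 1],  # advance in both
            )
    # Backtrack from (n, m) to recover the optimal warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, (i, j) = min(
            (D[i - 1][j - 1], (i - 1, j - 1)),
            (D[i - 1][j], (i - 1, j)),
            (D[i][j - 1], (i, j - 1)),
        )
    path.reverse()
    return D[n][m], path


# Toy "feature" sequences: b plays the same material as a but holds
# some frames twice as long, as a slower performance would.
cost, path = dtw_align([0, 1, 2, 3], [0, 0, 1, 2, 2, 3],
                       lambda x, y: abs(x - y))
# cost == 0.0: every frame of a finds an exact counterpart in b,
# and path records which frames correspond across the two versions.
```

In a real system the scalar values would be replaced by chroma (or similar) feature vectors and the distance by, e.g., a cosine distance, but the alignment principle is the same: the warping path provides exactly the event-level links between versions that the synchronization task asks for.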

Towards multimodal capture, annotation and semantic retrieval from performing arts

Communications in Computer and Information Science, 2011

Well-annotated dance media are an essential part of a nation's identity, transcending cultural and language barriers. Many dance video archives suffer from tremendous problems concerning authoring and access, because of the multimodal nature of human communication and the complex spatio-temporal relationships that exist between dancers. A multimodal dance document consists of video of dancers in space and time, their dance steps conveyed through gestures and emotions, and the accompanying song and music. This work presents the architecture of an annotation system capturing information directly through the use of sensors, comparing and interpreting it using a context and a user's model in order to annotate, index, and access multimodal documents.

Multimodal Music Processing (Dagstuhl Seminar 11041)

Dagstuhl Reports, 2011

From January 23 to January 28, 2011, the Dagstuhl Seminar 11041 "Multimodal Music Processing" was held at Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, we discussed various aspects of the automated processing of music-related documents. These documents may describe a musical work in different ways comprising visual representations (e.g., sheet music), symbolic representations (e.g., MIDI, tablatures, chords), acoustic representations (CD recordings), audio-visual representations (videos), or text-based metadata. In this report, we give an overview of the main contributions and results of the seminar. We start with an executive summary, which describes the main topics, goals, and group activities. Then one finds a list of abstracts giving a more detailed overview of the participants' contributions as well as of the ideas and results discussed in the group meetings and panels of our seminar.

A Framework for Managing Multimodal Digitized Music Collections

Lecture Notes in Computer Science, 2008

In this paper, we present a framework for managing heterogeneous, multimodal digitized music collections containing visual music representations (scanned sheet music) as well as acoustic music material (audio recordings). As a first contribution, we propose a preprocessing workflow comprising feature extraction, audio indexing, and music synchronization (linking the visual with the acoustic data). Then, as a second contribution, we introduce novel user interfaces for multimodal music presentation, navigation, and content-based retrieval. In particular, our system offers high quality audio playback with time-synchronous display of the digitized sheet music. Furthermore, our system allows a user to select regions within the scanned pages of a musical score in order to search for musically similar sections within the audio documents. Our novel user interfaces and search functionalities will be integrated into the library service system of the Bavarian State Library as part of the Probado project. This work was supported in part by Deutsche Forschungsgemeinschaft (DFG) under grant 554975 (1) Oldenburg BIB48 OLof 01-02.

Anechoic audio and 3D-video content database of small ensemble performances for virtual concerts

2019

This paper presents the details related to the creation of a public database of anechoic audio and 3D-video recordings of several small music ensemble performances. Musical extracts range from baroque to jazz music. This work aims at extending the already available public databases of anechoic stimuli, providing the community with flexible audiovisual content for virtual acoustic simulations. For each piece of music, musicians were first close-mic recorded together to provide an audio performance reference. This recording was followed by individual instrument retake recordings, while listening to the reference recording, to achieve the best audio separation between instruments. In parallel, 3D-video content was recorded for each musician, employing a multiple Kinect 2 RGB-Depth sensors system, allowing for the generation and easy manipulation of 3D point-clouds. Details of the choice of musical pieces, recording procedure, and technical details on the system architecture including p...