Modeling Temporal Tonal Relations in Polyphonic Music Through Deep Networks with a Novel Image-Based Representation

Induction and recapitulation of deep musical structure

Proceedings of the International Joint Conference on Artificial Intelligence, 1995

We describe recent extensions to our framework for the automatic generation of music-making programs. We have previously used genetic programming techniques to produce music-making programs that satisfy user-provided critical criteria. In this paper we describe new work on the use of connectionist techniques to automatically induce musical structure from a corpus. We show how the resulting neural networks can be used as critics that drive our genetic programming system. We argue that this framework can potentially support the induction and recapitulation of deep structural features of music. We present some initial results produced using neural and hybrid symbolic-neural critics, and we discuss directions for future work.
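
A toy sketch of the critic-driven loop described above; every name and method here is an illustrative placeholder, not the authors' system. A trained network scores the note sequence each evolved program produces, and that score serves as the genetic programming fitness.

    # Toy Python sketch: a neural critic supplies the fitness signal for GP selection.
    # `program.play()` and `critic.score()` are hypothetical interfaces.
    def fitness(program, critic):
        piece = program.play()        # evolved program emits a note sequence
        return critic.score(piece)    # trained network rates its musicality

    def select(population, critic, k):
        # Keep the k programs the critic rates highest.
        return sorted(population, key=lambda p: fitness(p, critic), reverse=True)[:k]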

The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

2019 International Workshop on Multilayer Music Representation and Processing (MMRP), 2019

With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short-sequence generation, symbolic music generation remains a challenging problem because the structure of compositions is usually complicated. In this study, we attempt to solve the melody generation problem constrained by a given chord progression. In particular, we explore the effect of explicit architectural encoding of musical structure by comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (a dilated temporal CNN). As far as we know, this is the first study to apply WaveNet to symbolic music generation, as well as the first systematic comparison between temporal CNNs and RNNs for music generation. We conduct a survey to evaluate our generations and apply the Variable Markov Oracle to music pattern discovery. Experimental results show that encoding structure more explicitly, using a stack of dilated convolution layers, improves performance significantly, and that globally encoding the underlying chord progression into the generation procedure yields further gains.
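
To make the architectural contrast concrete, here is a minimal PyTorch sketch of a causal dilated-convolution generator of the kind the abstract compares against an LSTM; layer sizes, names, and the chord-conditioning scheme are assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class DilatedMelodyNet(nn.Module):
        def __init__(self, n_pitches=48, n_chords=24, channels=64, n_layers=6):
            super().__init__()
            self.embed = nn.Conv1d(n_pitches + n_chords, channels, kernel_size=1)
            # Doubling the dilation each layer grows the receptive field
            # exponentially; this is the explicit structure encoding at issue.
            self.layers = nn.ModuleList([
                nn.Conv1d(channels, channels, kernel_size=2, dilation=2 ** i)
                for i in range(n_layers)
            ])
            self.out = nn.Conv1d(channels, n_pitches, kernel_size=1)

        def forward(self, melody, chords):
            # melody: (batch, n_pitches, time); chords: (batch, n_chords, time)
            x = self.embed(torch.cat([melody, chords], dim=1))
            for conv in self.layers:
                pad = (conv.dilation[0], 0)  # left-pad so each step sees only the past
                x = torch.relu(conv(nn.functional.pad(x, pad)))
            return self.out(x)  # per-step logits over the next melody pitch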

Deep Music Information Dynamics

arXiv, 2021

Music comprises a set of complex simultaneous events organized in time. In this paper we introduce a novel framework, which we call Deep Musical Information Dynamics, that combines two parallel streams: a low-rate latent representation stream, assumed to capture the dynamics of a thought process, contrasted with higher-rate information dynamics derived from the musical data itself. Motivated by rate-distortion theories of human cognition, we propose a framework for exploring possible relations between imaginary anticipations existing in the listener's mind and the information dynamics of the musical surface itself. The model is demonstrated on symbolic (MIDI) data, since accounting for the acoustic surface would require many more layers to capture instrument properties and expressive performance inflections. The mathematical framework is based on variational encoding that first establishes a high-rate representation of the musical observations, which is then reduced using a bit-allocation method into a parallel low-rate data stream. The combined loss considered here includes both the information rate of the time evolution of each stream and the fidelity of encoding, measured in terms of mutual information between the high- and low-rate representations. In the simulations presented in the paper we are able to juxtapose aspects of latent/imaginary surprisal versus surprisal of the music surface in a manner that is quantifiable and computationally tractable. The set of computational tools is discussed in the paper, suggesting that a trade-off between compression and prediction is an important factor in the analysis and design of time-based generative music models.
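
In symbols, one plausible reading of the combined objective is the following; the notation, signs, and weighting are assumptions, since the abstract only states which terms are included.

    \mathcal{L} = \mathrm{IR}(h) + \mathrm{IR}(z) - \beta\, I(h;\, z),
    \qquad \mathrm{IR}(s_{1:T}) = \sum_{t=1}^{T} I\bigl(s_t;\; s_{1:t-1}\bigr)

Here h is the high-rate encoding of the musical surface, z the low-rate latent stream, I(.;.) mutual information, and beta trades compression against encoding fidelity.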

Musical Deep Learning: Stylistic Melodic Generation with Complexity Based Similarity

The wide-ranging impact of deep learning models implies significant applications in music analysis, retrieval, and generation. Initial findings from a musical application of a conditional restricted Boltzmann machine (CRBM) show promise for informing creative computation. Taking advantage of the CRBM's ability to model temporal dependencies, full reconstructions of pieces are achievable given a few starting seed notes. Generating new material using figuration from the training corpus requires restrictions on the size and memory space of the CRBM, forcing associative rather than perfect recall. Musical analysis and information-complexity measures show the musical encoding to be the primary determinant of the nature of the generated results.
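
A minimal numpy sketch of the conditional-RBM mechanism the abstract relies on: past frames modulate the biases of the current frame, so a few seed notes can drive reconstruction. Shapes and names are assumptions, not the authors' model.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def crbm_step(v_past, W, A, B, b_v, b_h, rng):
        """One mean-field reconstruction of the next frame given history `v_past`."""
        b_v_dyn = b_v + A @ v_past   # history shifts the visible biases
        b_h_dyn = b_h + B @ v_past   # history shifts the hidden biases
        v = rng.random(b_v.shape) < sigmoid(b_v_dyn)   # biased random visible state
        h = sigmoid(W @ v + b_h_dyn)                   # hidden activation
        return sigmoid(W.T @ h + b_v_dyn)              # reconstructed next frame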

A Comparative Study of Neural Models for Polyphonic Music Sequence Transduction

2019

Automatic transcription of polyphonic music remains a challenging task in the field of Music Information Retrieval. One under-investigated point is the post-processing of time-pitch posteriograms into binary piano rolls. In this study, we investigate this task using a variety of neural network models and training procedures. We introduce an adversarial framework that we compare against more traditional training losses. We also propose the use of binary neuron outputs and compare them to the usual real-valued outputs in both training frameworks; this allows us to train networks directly using the F-measure as the training objective. We evaluate these methods using two kinds of transduction networks and two different multi-pitch detection systems, and compare the results against baseline note-tracking methods on a dataset of classical piano music. Analysis of the results indicates that (1) convolutional models improve results over baseline models, but no improvement is reported for recurrent...
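
Training directly on the F-measure requires a differentiable surrogate; the PyTorch sketch below shows one common soft relaxation, which is an assumption for illustration rather than necessarily the paper's exact formulation.

    import torch

    def soft_f_measure_loss(probs, target, eps=1e-8):
        """probs, target: (batch, time, pitches) in [0, 1]; returns 1 - soft F1."""
        tp = (probs * target).sum()                    # soft true positives
        precision = tp / (probs.sum() + eps)
        recall = tp / (target.sum() + eps)
        f1 = 2 * precision * recall / (precision + recall + eps)
        return 1.0 - f1                                # minimize to maximize F1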

Deep Learning Music Generation

2019

We are interested in using deep learning models to generate new music. Using the Maestro Dataset, we will use an LSTM architecture that takes tokenized MIDI files as input and outputs predictions for notes. Accuracy will be measured by taking predicted notes and comparing them to ground truths. Using A.I. for music is a relatively new area of study, and this project provides an investigation into creating an effective model for the music industry.
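
A hedged PyTorch sketch of the setup described: an LSTM over tokenized MIDI events predicting the next note. Vocabulary size and dimensions are placeholders.

    import torch
    import torch.nn as nn

    class NextNoteLSTM(nn.Module):
        def __init__(self, vocab_size=388, embed_dim=128, hidden_dim=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens):          # tokens: (batch, time) int64
            h, _ = self.lstm(self.embed(tokens))
            return self.head(h)             # logits for the next token at each step

    # Accuracy as described: compare predicted next tokens with ground truth, e.g.
    # acc = (model(x).argmax(-1)[:, :-1] == x[:, 1:]).float().mean()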

Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music

arXiv, 2020

Automatic generation of sequences has been a heavily explored field in recent years. In particular, natural language processing and automatic music composition have gained importance due to recent advances in machine learning and in neural networks with intrinsic memory mechanisms, such as Recurrent Neural Networks. This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition. The proposed approach considers music theory concepts such as transposition, and uses data transformations (embeddings) to introduce semantic meaning and improve the quality of the generated melodies. A set of quantitative metrics is presented to automatically evaluate the performance of the proposed architecture by measuring the tonality of the musical compositions.
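
As a concrete example of the transposition concept the abstract mentions: shifting every pitch of a melody by a fixed interval preserves its relative structure and multiplies the training data. The MIDI note numbers and names below are illustrative.

    def transpose(melody, semitones):
        """Shift every MIDI pitch by `semitones`, leaving rests (None) in place."""
        return [p + semitones if p is not None else None for p in melody]

    theme = [60, 62, 64, 65, 67]                             # C major fragment
    augmented = [transpose(theme, s) for s in range(-6, 6)]  # the 12 keys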

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription

arXiv preprint arXiv:1206.6392, 2012

We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to improve the accuracy of ...
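
The conditioning skeleton can be sketched as follows; note that the paper's per-step estimators are richer (e.g. an RBM), so the independent-Bernoulli output here is a deliberate simplification for illustration.

    import torch
    import torch.nn as nn

    class RNNSequenceModel(nn.Module):
        def __init__(self, n_pitches=88, hidden=128):
            super().__init__()
            self.rnn = nn.GRU(n_pitches, hidden, batch_first=True)
            self.params = nn.Linear(hidden, n_pitches)  # per-step Bernoulli logits

        def log_likelihood(self, roll):      # roll: (batch, time, 88), binary floats
            h, _ = self.rnn(roll[:, :-1])    # condition step t on frames < t
            logits = self.params(h)
            ll = -nn.functional.binary_cross_entropy_with_logits(
                logits, roll[:, 1:], reduction="none")
            return ll.sum(dim=(1, 2))        # per-sequence log-likelihood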

Music transcription modelling and composition using deep learning

arXiv, 2016

We apply deep learning methods, specifically long short-term memory (LSTM) networks, to music transcription modelling and composition. We build and train LSTM networks using approximately 23,000 music transcriptions expressed with a high-level vocabulary (ABC notation), and use them to generate new transcriptions. Our practical aim is to create music transcription models useful in particular contexts of music composition. We present results from three perspectives: 1) at the population level, comparing descriptive statistics of the set of training transcriptions and generated transcriptions; 2) at the individual level, examining how a generated transcription reflects the conventions of a music practice in the training transcriptions (Celtic folk); 3) at the application level, using the system for idea generation in music composition. We make our datasets, software and sound examples open and available: \url{this https URL}.
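
Generation from such a model is typically autoregressive character sampling; a hedged Python sketch, where `model`, `stoi`, and `itos` are assumed to come from training rather than from the released code:

    import torch

    def sample_abc(model, stoi, itos, seed="X:1\n", max_len=400, temp=1.0):
        tokens = [stoi[c] for c in seed]
        for _ in range(max_len):
            x = torch.tensor(tokens).unsqueeze(0)   # (1, time)
            logits = model(x)[0, -1] / temp         # distribution for the next char
            tokens.append(torch.multinomial(logits.softmax(-1), 1).item())
        return "".join(itos[t] for t in tokens)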

Music Generation Using Deep Learning

2021

This paper explores the use of Long Short-Term Memory neural networks (LSTMNN) for the generation of musical sequences in ABC notation. The proposed approach takes ABC notations from the Nottingham dataset and encodes them to be fed as input to the neural network. The primary objective is to give the network an arbitrary note and let it process and extend a sequence from that note until a good piece of music is produced. Multiple tuning runs were performed to adjust the network's parameters for optimal generation. The output is assessed on the basis of rhythm, harmony, and grammatical accuracy.
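
The encoding step amounts to mapping ABC characters to integer indices before they are fed to the LSTM; a minimal illustration with made-up tune strings:

    tunes = ["X:1\nK:D\nDEFG ABde|", "X:2\nK:G\nGABc dedB|"]
    vocab = sorted({c for t in tunes for c in t})     # character vocabulary
    stoi = {c: i for i, c in enumerate(vocab)}        # char -> index
    encoded = [[stoi[c] for c in t] for t in tunes]   # integer sequences per tune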