GitHub - ldzhangyx/instruct-MusicGen: The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning".
Description
This is the official repository for the paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning".
If you run into any problem running the code, please open an issue and I will help as much as I can.
Pretrained ckpt
To promote transparency and reproducibility in research, I retrained a similar model after the internship using only publicly available datasets. It follows the same methodology described in the paper.
**Note that this is NOT the official ckpt and has NO relation to Sony. Its performance is similar to that of the official checkpoint.**
https://huggingface.co/ldzhangyx/instruct-MusicGen/blob/main/finetuned.ckpt
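The link above opens the Hugging Face web page for the file (`/blob/`). As a minimal sketch, assuming the usual Hugging Face convention that replacing `/blob/` with `/resolve/` gives the direct-download URL, the checkpoint could be fetched as follows. The helper names `to_download_url` and `download_checkpoint` and the local path `finetuned.ckpt` are illustrative and not part of this repository.

```python
import urllib.request

CKPT_PAGE_URL = "https://huggingface.co/ldzhangyx/instruct-MusicGen/blob/main/finetuned.ckpt"


def to_download_url(page_url: str) -> str:
    """Turn a Hugging Face file page URL (/blob/) into a direct-download URL (/resolve/)."""
    return page_url.replace("/blob/", "/resolve/", 1)


def download_checkpoint(page_url: str = CKPT_PAGE_URL, dest: str = "finetuned.ckpt") -> str:
    """Download the checkpoint to `dest` (a hypothetical local path) and return that path."""
    urllib.request.urlretrieve(to_download_url(page_url), dest)
    return dest


# Uncomment to fetch the file (this is a large download):
# download_checkpoint()
```

The download call is left commented out so that importing or running the snippet does not start a large transfer by accident.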
Demo page
https://bit.ly/instruct-musicgen
Installation
Pip
clone project
git clone https://github.com/ldzhangyx/instruct-MusicGen/
cd instruct-MusicGen
[OPTIONAL] create conda environment
conda create -n myenv python=3.11.7
conda activate myenv
install pytorch according to instructions
https://pytorch.org/get-started/
install requirements
pip install -r requirements.txt
Conda
clone project
git clone https://github.com/ldzhangyx/instruct-MusicGen/
cd instruct-MusicGen
create conda environment and install dependencies
conda env create -f environment.yaml -n myenv
activate conda environment
conda activate myenv
How to run
Train model with default configuration
train on CPU
python src/train.py trainer=cpu
train on GPU
python src/train.py trainer=gpu
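If the same launch script has to work on machines with and without a GPU, the choice between the two commands above can be made at run time. This is only a sketch: `cuda_available` and `train_command` are hypothetical helpers, and it assumes PyTorch's `torch.cuda.is_available()` is a good enough signal for picking `trainer=gpu`.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def train_command(use_gpu: bool) -> list[str]:
    """Build the training command from this README for the chosen trainer config."""
    trainer = "gpu" if use_gpu else "cpu"
    return ["python", "src/train.py", f"trainer={trainer}"]


# Example (run from the repository root):
# import subprocess
# subprocess.run(train_command(cuda_available()), check=True)
```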
You may need to change essential parameters in config/config.yaml to fit your own dataset.
You can override any parameter from the command line like this:
python src/train.py trainer.max_epochs=50 data.batch_size=4
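The `key.subkey=value` syntax addresses nested configuration fields by dotted path. The following pure-Python sketch illustrates that idea only; it is not the project's actual config loader (the syntax resembles Hydra's, but that is an assumption here), and the starting values in `config` are made up for illustration.

```python
import ast
from typing import Any


def parse_value(text: str) -> Any:
    """Interpret "50" as int, "0.1" as float, and leave other strings untouched."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Apply `a.b=value` overrides in place to a nested dict and return it."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk down the nesting, creating intermediate dicts if needed.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return config


# Hypothetical defaults, only to show the effect of the overrides.
config = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 16}}
apply_overrides(config, ["trainer.max_epochs=50", "data.batch_size=4"])
print(config)  # {'trainer': {'max_epochs': 50}, 'data': {'batch_size': 4}}
```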
Evaluation
Step 1: Generate evaluation datasets
python src/data/slakh_datamodule.py
Step 2: Generate music files
For the add, remove, and extract operations, please change the parameters in both test_step() in src/models/instructmusicgenadapter_module.py and __getitem__() in src/data/slakh_datamodule.py.
Currently this must be done manually; we will provide a script to automate this process soon.
Step 3: Evaluate
Please make sure the generated music files are in the corresponding locations.
python evaluation/utils.py # to generate a csv file for CLAP calculation
python evaluation/main.py
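Because the second command depends on the CSV file produced by the first, the two should run in order and stop on the first failure. A minimal sketch of such a wrapper, assuming it is launched from the repository root; `run_evaluation` and its `runner` parameter are hypothetical and not part of the repository.

```python
import subprocess

# The two evaluation commands from Step 3, in the order given above.
EVAL_COMMANDS = [
    ["python", "evaluation/utils.py"],  # generates the CSV file for the CLAP calculation
    ["python", "evaluation/main.py"],
]


def run_evaluation(commands=EVAL_COMMANDS, runner=subprocess.run) -> int:
    """Run each command in order, stopping at the first failure; return how many ran."""
    completed = 0
    for command in commands:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(command, check=True)
        completed += 1
    return completed


# Example (run from the repository root):
# run_evaluation()
```

The `runner` argument exists only so the sequencing can be exercised without launching the real scripts.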
Inference script
After preparing the checkpoint and the input audio file, you can generate audio via
Citation
@article{zhang2024instruct,
title={Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning},
author={Zhang, Yixiao and Ikemiya, Yukara and Choi, Woosung and Murata, Naoki and Mart{\'\i}nez-Ram{\'\i}rez, Marco A and Lin, Liwei and Xia, Gus and Liao, Wei-Hsiang and Mitsufuji, Yuki and Dixon, Simon},
journal={arXiv preprint arXiv:2405.18386},
year={2024}
}