An AI for Music Generation

Overview

MuseGAN

MuseGAN is a project on music generation. In a nutshell, we aim to generate polyphonic music of multiple tracks (instruments). The proposed models are able to generate music either from scratch, or by accompanying a track given a priori by the user.

We train the model with training data collected from Lakh Pianoroll Dataset to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.

Sample results are available here.

Looking for a PyTorch version? Check out this repository.

Prerequisites

Below we assume the working directory is the repository root.

Install dependencies

  • Using pipenv (recommended)

    Make sure pipenv is installed. (If not, simply run pip install pipenv.)

    # Install the dependencies
    pipenv install
    # Activate the virtual environment
    pipenv shell
  • Using pip

    # Install the dependencies
    pip install -r requirements.txt

Prepare training data

The training data is collected from Lakh Pianoroll Dataset (LPD), a new multitrack pianoroll dataset.

# Download the training data
./scripts/download_data.sh
# Store the training data to shared memory
./scripts/process_data.sh

You can also download the training data manually (train_x_lpd_5_phr.npz).

As pianoroll matrices are generally sparse, we store only the indices of nonzero elements and the array shape into a npz file to save space, and later restore the original array. To save some training data data into this format, simply run np.savez_compressed("data.npz", shape=data.shape, nonzero=data.nonzero())

Scripts

We provide several shell scripts for easy managing the experiments. (See here for a detailed documentation.)

Below we assume the working directory is the repository root.

Train a new model

  1. Run the following command to set up a new experiment with default settings.

    # Set up a new experiment
    ./scripts/setup_exp.sh "./exp/my_experiment/" "Some notes on my experiment"
  2. Modify the configuration and model parameter files for experimental settings.

  3. You can either train the model:

    # Train the model
    ./scripts/run_train.sh "./exp/my_experiment/" "0"

    or run the experiment (training + inference + interpolation):

    # Run the experiment
    ./scripts/run_exp.sh "./exp/my_experiment/" "0"

Collect training data

Run the following command to collect training data from MIDI files.

# Collect training data
./scripts/collect_data.sh "./midi_dir/" "data/train.npy"

Use pretrained models

  1. Download pretrained models

    # Download the pretrained models
    ./scripts/download_models.sh

    You can also download the pretrained models manually (pretrained_models.tar.gz).

  2. You can either perform inference from a trained model:

    # Run inference from a pretrained model
    ./scripts/run_inference.sh "./exp/default/" "0"

    or perform interpolation from a trained model:

    # Run interpolation from a pretrained model
    ./scripts/run_interpolation.sh "./exp/default/" "0"

Outputs

By default, samples will be generated alongside the training. You can disable this behavior by setting save_samples_steps to zero in the configuration file (config.yaml). The generated will be stored in the following three formats by default.

  • .npy: raw numpy arrays
  • .png: image files
  • .npz: multitrack pianoroll files that can be loaded by the Pypianoroll package

You can disable saving in a specific format by setting save_array_samples, save_image_samples and save_pianoroll_samples to False in the configuration file.

The generated pianorolls are stored in .npz format to save space and processing time. You can use the following code to write them into MIDI files.

from pypianoroll import Multitrack

m = Multitrack('./test.npz')
m.write('./test.mid')

Sample Results

Some sample results can be found in ./exp/ directory. More samples can be downloaded from the following links.

Papers

Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Hao-Wen Dong and Yi-Hsuan Yang
in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
[website] [arxiv] [paper] [slides(long)] [slides(short)] [poster] [code]

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang and Yi-Hsuan Yang, (*equal contribution)
in Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), 2018.
[website] [arxiv] [paper] [slides] [code]

MuseGAN: Demonstration of a Convolutional GAN Based Model for Generating Multi-track Piano-rolls
Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang and Yi-Hsuan Yang (*equal contribution)
in Late-Breaking Demos of the 18th International Society for Music Information Retrieval Conference (ISMIR), 2017. (two-page extended abstract)
[paper] [poster]

Owner
Hao-Wen Dong
PhD Candidate in Computer Science at UC San Diego | Previous Intern at Dolby and Yamaha | Music x AI
Hao-Wen Dong
Python game programming in Jupyter notebooks.

Jupylet Jupylet is a Python library for programming 2D and 3D games, graphics, music and sound synthesizers, interactively in a Jupyter notebook. It i

Nir Aides 178 Dec 09, 2022
Voice helper on russian

Voice helper on russian

KreO 1 Jun 30, 2022
An AI for Music Generation

An AI for Music Generation

Hao-Wen Dong 1.3k Dec 31, 2022
Suyash More 111 Jan 07, 2023
Users can transcribe their favorite piano recordings to MIDI files after installation

Users can transcribe their favorite piano recordings to MIDI files after installation

190 Dec 17, 2022
A tool for retrieving audio in the past

Rewinder A tool for retrieving audio in the past. Ever felt like, I need to remember that discussion which happened 10 min back. Now you can! Rewind a

Bharat 1 Jan 24, 2022
Nayeli: cool telegram groups vc music project

Nayeli-music Nayeli 🥀 is cool telegram 🍎 groups vc music project 🎋 . Nayeli-music Nayeli Deployment 🎋 📲 Esy deploy 🐾️ Source Owner ♥️ ❄️ He is s

Kasun bandara 2 Dec 20, 2021
Mousai is a simple application that can identify song like Shazam

Mousai is a simple application that can identify song like Shazam. It saves the artist, album, and title of the identified song in a JSON file.

Dave Patrick 662 Jan 07, 2023
Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection

BokingChen 9 Jun 04, 2022
a library for audio and music analysis

aubio aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at wh

aubio 2.9k Dec 30, 2022
Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists with a Plex server.

PlexMusicSync Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists (m3u, m3u8) with a Plex server. The song file

Tom Goetz 9 Jul 07, 2022
Telegram Bot to play music in VoiceChat with Channel Support and autostarts Radio.

VCPlayerBot Telegram bot to stream videos in telegram voicechat for both groups and channels. Supports live streams, YouTube videos and telegram media

Abdisamad Omar Mohamed 1 Oct 15, 2021
Spotipy - Player de música simples em Python

Spotipy Player de música simples em Python, utilizando a biblioteca Pysimplegui para a interface gráfica. Este tocador é bastante simples em si, mas p

Adelino Almeida 4 Feb 28, 2022
Audio spatialization over WebRTC and JACK Audio Connection Kit

Audio spatialization over WebRTC Spatify provides a framework for building multichannel installations using WebRTC.

Bruno Gola 34 Jun 29, 2022
A collection of python scripts for extracting and analyzing acoustics from audio files.

pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe

Tim 74 Dec 26, 2022
A GUI-based audio player with support for a large variety of formats

Miza-Player A GUI-based audio player with support for a large variety of formats, able to play from web-hosted media platforms such as YouTube, includ

Thomas Xin 3 Dec 14, 2022
❤️ This Is The EzilaXMusicPlayer Advaced Repo 🎵

Telegram EzilaXMusicPlayer Bot 🎵 A bot that can play music on telegram group's voice Chat ❤️ Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.7+

Sadew Jayasekara 11 Nov 12, 2022
A small project where I identify notes and key harmonies in a piece of music and use them further to recreate and generate the same piece of music through Python

A small project where I identify notes and key harmonies in a piece of music and use them further to recreate and generate the same piece of music through Python

5 Oct 07, 2022
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Auditory Slow-Fast This repository implements the model proposed in the paper: Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen, Slow-Fa

Evangelos Kazakos 57 Dec 07, 2022
Jarvis From Basic to Advance - make a voice assistant similar to JARVIS (in iron man movie)

JARVIS (Basic to Advance) This was my attempt to make a voice assistant similar to JARVIS (in iron man movie) Let's be honest, it's not as intelligent

codesempai 17 Dec 25, 2022