This repository contains the source code of an efficient 1D probabilistic model for music time analysis proposed in ICASSP2022 venue.

Last update: Dec 16, 2022

Related tags

Deep Learning 1D-StateSpace

Overview

Jump Reward Inference for 1D Music Rhythmic State Spaces

An implementation of the probablistic jump reward inference model for music rhythmic information retrieval using the proposed 1D state space.

This repository contains the source code and demo videos of a joint music rhythmic analyzer system using the 1D state space and jump reward technique proposed in ICASSP-2022. This implementation includes music beat, downbeat, tempo, and meter tracking jointly and in a causal fashion.

arXiv 2111.00704

The model first takes the waveform to the spectral domain and then feeds them into one of the pre-trained BeatNet models to obtain beat/downbeat activations. Finally, the activations are used in a jump-reward inference model to infer beats, downbeats, tempo, and meter.

System Input:

Raw audio waveform

System Output:

A vector including beats, downbeats, local tempo, and local meter columns, respectively and with the following shape: numpy_array(num_beats, 4).

Installation Command:

Approach #1: Installing binaries from the pypi website:

pip install jump-reward-inference

Approach #2: Installing directly from the Git repository:

pip install git+https://github.com/mjhydri/1D-StateSpace

Usage Example:

estimator = joint_inference(1, plot=True) 

output = estimator.process("music file directory")

Video Demos:

This section demonstrates the system performance for several music genres. Each demo comprises four plots that are described as follows:

The first plot: 1D state space for music beat and tempo tracking. Each bar represents the posterior probability of the corresponding state at each time frame.
The second plot: The jump-back reward vector for the corresponding beat states.
The third plot:1D state space for music downbeat and meter tracking.
The fourth plot: The jump-back reward vector for the corresponding downbeat states.

1: Music Genre: Pop

2: Music Genre: Country

3: Music Genre: Reggae

4: Music Genre: Blues

5: Music Genre: Classical

Demos Discussion:

1- As demo videos suggest, the system infers multiple music rhythmic parameters, including music beat, downbeat, tempo and meter jointly and in an online fashion using very compact 1D state spaces and jump back reward technique. The system works suitably for different music genres. However, the process is relatively more straightforward for some genres such as pop and country due to the rich percussive content, solid attacks, and simpler rhythmic structures. In contrast, it is more challenging for genres with poor percussive profile, longer attack times, and more complex rhythmic structures such as classical music.

2- Since both neural networks and inference models are designed for online/real-time applications, the causalilty constrains are applied and future data is not accessible. It makes the jumpback weigths weaker initially and become stronger over time.

3- Given longer listening time is required to infer higher hierarchies, i.e., downbeat and meter, within the very early few seconds, downbeat results are less confident than lower hierarchies, i.e., beat and tempo, however, they get accurate after observing a bar period.

Acknowledgement

This work has been partially supported by the National Science Foundation grant 1846184.

References:

M. Heydari, M. McCallum, A. Ehmann and Z. Duan, "A Novel 1D State Space for Efficient Music Rhythmic Analysis", In Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2022. #(Submitted)

M. Heydari, F. Cwitkowitz, and Z. Duan, “BeatNet:CRNN and particle filtering for online joint beat down-beat and meter tracking,” in Proc. of the 22th Intl. Conf.on Music Information Retrieval (ISMIR), 2021.

M. Heydari and Z. Duan, “Don’t Look Back: An online beat tracking method using RNN and enhanced particle filtering,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2021.

Comments

Tempo off by 5 consistently

Hi Mojtaba,

I was trying out your package but find that the reported tempo is off consistently by 5. The easiest test of this is to use 808kick120bpm.mp3 from the beatnet package, though I found the same thing with another music sample. Beatnet reports the. correct tempo.

Any idea what might cause this?

Best, Alex

opened by akhudek 0

Releases(v0.0.6)

v0.0.6(Nov 6, 2021)

This release includes the jump reward inference model on the proposed 1D state space for the joint beat, downbeat, tempo, and meter tracking using the BeatNet activations.
Source code(tar.gz)
Source code(zip)
jump_reward_inference-0.0.6.tar.gz(10.24 KB)
jump_reward_inference.zip(6.49 KB)

Code and data of the Fine-Grained R2R Dataset proposed in paper Sub-Instruction Aware Vision-and-Language Navigation

Fine-Grained R2R Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP2020 paper Sub-Instruction Aware Vision-and-Language Navigation. C

34 Nov 15, 2022

Code for CMaskTrack R-CNN (proposed in Occluded Video Instance Segmentation)

CMaskTrack R-CNN for OVIS This repo serves as the official code release of the CMaskTrack R-CNN model on the Occluded Video Instance Segmentation data

61 Nov 25, 2022

Symbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis in JAX

SYMPAIS: Symbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis Overview | Installation | Documentation | Examples | Notebo

4 Sep 13, 2022

This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab

398 Dec 30, 2022

This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.

TSForecasting This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the tim

80 Dec 30, 2022

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

66 Dec 26, 2022

This repository contains the source code of an efficient 1D probabilistic model for music time analysis proposed in ICASSP2022 venue.

Related tags

Overview

Jump Reward Inference for 1D Music Rhythmic State Spaces

System Input:

System Output:

Installation Command:

Usage Example:

Video Demos:

Demos Discussion:

Acknowledgement

References:

You might also like...

Code and data of the Fine-Grained R2R Dataset proposed in paper Sub-Instruction Aware Vision-and-Language Navigation

Code for CMaskTrack R-CNN (proposed in Occluded Video Instance Segmentation)

Symbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis in JAX

This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

This Repostory contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms

Comments

Tempo off by 5 consistently

Releases(v0.0.6)

v0.0.6(Nov 6, 2021)

Owner

Mojtaba Heydari

Only works with the dashboard version / branch of jesse

WHENet - ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L

A smaller subset of 10 easily classified classes from Imagenet, and a little more French

A simple interface for editing natural photos with generative neural networks.

Attack classification models with transferability, black-box attack; unrestricted adversarial attacks on imagenet

[ICCV'2021] "SSH: A Self-Supervised Framework for Image Harmonization", Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning.

Official implementation for paper Render In-between: Motion Guided Video Synthesis for Action Interpolation

PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime

Solving SMPL/MANO parameters from keypoint coordinates.

DM-ACME compatible implementation of the Arm26 environment from Mujoco

Code for the published paper : Learning to recognize rare traffic sign

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

N-Omniglot is a large neuromorphic few-shot learning dataset

Contains a bunch of different python programm tasks

MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Systemic Evolutionary Chemical Space Exploration for Drug Discovery

A `Neural = Symbolic` framework for sound and complete weighted real-value logic

[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

Learning kernels to maximize the power of MMD tests