AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

Overview

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data [WIP]

Unofficial Pytorch implementation of AdaSpeech 2.

Requirements :

All code written in Python 3.6.2 .

  • Install Pytorch

Before installing pytorch please check your Cuda version by running following command : nvcc --version

pip install torch torchvision

In this repo I have used Pytorch 1.6.0 for torch.bucketize feature which is not present in previous versions of PyTorch.

  • Installing other requirements :
pip install -r requirements.txt
  • To use Tensorboard install tensorboard version 1.14.0 seperatly with supported tensorflow (1.14.0)

For Preprocessing :

filelists folder contains MFA (Motreal Force aligner) processed LJSpeech dataset files so you don't need to align text with audio (for extract duration) for LJSpeech dataset. For other dataset follow instruction here. For other pre-processing run following command :

python nvidia_preprocessing.py -d path_of_wavs

For finding the min and max of F0 and Energy

python compute_statistics.py

Update the following in hparams.py by min and max of F0 and Energy

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy

Training :

[WIP]

Citations :

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
@misc{yan2021adaspeech,
      title={AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data}, 
      author={Yuzi Yan and Xu Tan and Bohan Li and Tao Qin and Sheng Zhao and Yuan Shen and Tie-Yan Liu},
      year={2021},
      eprint={2104.09715},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
Owner
Rishikesh (ऋषिकेश)
Deep Learning/ AI Researcher | Open Source enthusiast | Text to Speech | Speech Synthesis | Generative Models | Object detection | Language Understanding
Rishikesh (ऋषिकेश)
PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-supervised ViT.

MAE for Self-supervised ViT Introduction This is an unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-sup

36 Oct 30, 2022
U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI

U-Net for brain segmentation U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI based on a deep learning segmentation alg

562 Jan 02, 2023
STBP is a way to train SNN with datasets by Backward propagation.

Spiking neural network (SNN), compared with depth neural network (DNN), has faster processing speed, lower energy consumption and more biological interpretability, which is expected to approach Stron

Ling Zhang 18 Dec 09, 2022
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

Tadas Baltrusaitis 5.8k Dec 31, 2022
[ICCV'21] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction

UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction Project Page | Paper | Supplementary | Video This reposit

331 Dec 28, 2022
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"

CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c

201 Dec 29, 2022
Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021)

Transferable Semantic Augmentation for Domain Adaptation Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021) Paper

66 Dec 16, 2022
Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

Context-Aware-Healthcare Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs Download

LuChang 9 Dec 26, 2022
The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

AI2 111 Dec 18, 2022
Computer Vision and Pattern Recognition, NUS CS4243, 2022

CS4243_2022 Computer Vision and Pattern Recognition, NUS CS4243, 2022 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : h

Xavier Bresson 142 Dec 15, 2022
Offline Reinforcement Learning with Implicit Q-Learning

Offline Reinforcement Learning with Implicit Q-Learning This repository contains the official implementation of Offline Reinforcement Learning with Im

Ilya Kostrikov 126 Jan 06, 2023
PyTorch implementation for Graph Contrastive Learning with Augmentations

Graph Contrastive Learning with Augmentations PyTorch implementation for Graph Contrastive Learning with Augmentations [poster] [appendix] Yuning You*

Shen Lab at Texas A&M University 382 Dec 15, 2022
Yas CRNN model training - Yet Another Genshin Impact Scanner

Yas-Train Yet Another Genshin Impact Scanner 又一个原神圣遗物导出器 介绍 该仓库为 Yas 的模型训练程序 相关资料 MobileNetV3 CRNN 使用 假设你会设置基本的pytorch环境。 生成数据集 python main.py gen 训练

wormtql 18 Jan 08, 2023
Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D) Code & Data Appendix for Conjugated Discrete Distributions for Distr

1 Jan 11, 2022
The implementation of PEMP in paper "Prior-Enhanced Few-Shot Segmentation with Meta-Prototypes"

Prior-Enhanced network with Meta-Prototypes (PEMP) This is the PyTorch implementation of PEMP. Overview of PEMP Meta-Prototypes & Adaptive Prototypes

Jianwei ZHANG 8 Oct 14, 2021
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Introduction This is an official implementation of CvT: Introducing Convolutions to Vision Transformers. We present a new architecture, named Convolut

Microsoft 408 Dec 30, 2022
This is the pytorch re-implementation of the IterNorm

IterNorm-pytorch Pytorch reimplementation of the IterNorm methods, which is described in the following paper: Iterative Normalization: Beyond Standard

Lei Huang 32 Dec 27, 2022
Advances in Neural Information Processing Systems (NeurIPS), 2020.

What is being transferred in transfer learning? This repo contains the code for the following paper: Behnam Neyshabur*, Hanie Sedghi*, Chiyuan Zhang*.

Google Research 36 Aug 26, 2022
Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution

FAU Implementation of the paper: Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution. Yingruo

Evelyn 78 Nov 29, 2022
Lightweight mmm - Lightweight (Bayesian) Media Mix Model

Lightweight (Bayesian) Media Mix Model This is not an official Google product. L

Google 342 Jan 03, 2023