The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Related tags

Deep LearningSF-Net
Overview

SF-Net for fullband SE

This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement", which is submitted to Interspecch 2022. Some audio samples are provided here and the code for GCRN-full, DS-Net-full, CTS-Net-full and the network configuration of SF-Net are released.

Abstract:Due to the high computational complexity to model more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize the compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.

Demo page of audio samples

(https://yuguochencuc.github.io/sfnet_demo/)

System flowchart of SF-Net

image

Results:

Abaltion study

lQLPDhtH56hfHqXNAjHNA7OwrtrfF1S_QEECRntyY8DWAA_947_561

Comparison with SOTA

lQLPDhtH6ITooSDNAz7NA7GwySl8YbYqe-8CRnzbbEA6AA_945_830

Visualization of spectrograms

VB dataset

lQLPDhtH6NmMG6rNAwTNA6uw8qUBkZFtUxgCRn1lwQBsAA_939_772

DNS blind set

lQLPDhtH8iT2-N_NAubNA42wBREh0WKHv5wCRoygD0BOAA_909_742

Owner
Guochen Yu
Phd student at Communication University of China and Institute of Acoustics, Chinese Academy of Sciences
Guochen Yu
Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification

Fine-grainedImageClassification Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification We trained model here: lin

ZhenchaoTang 14 Oct 21, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 05, 2023
Official implementation of Long-Short Transformer in PyTorch.

Long-Short Transformer (Transformer-LS) This repository hosts the code and models for the paper: Long-Short Transformer: Efficient Transformers for La

NVIDIA Corporation 198 Dec 29, 2022
lightweight python wrapper for vowpal wabbit

vowpal_porpoise Lightweight python wrapper for vowpal_wabbit. Why: Scalable, blazingly fast machine learning. Install Install vowpal_wabbit. Clone and

Joseph Reisinger 163 Nov 24, 2022
FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks

FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks Image Classification Dataset: Google Landmark, COCO, ImageNet Model: Efficient

FedML-AI 62 Dec 10, 2022
Here we present the implementation in TensorFlow of our work about liver lesion segmentation accepted in the Machine Learning 4 Health Workshop

Detection-aided liver lesion segmentation Here we present the implementation in TensorFlow of our work about liver lesion segmentation accepted in the

Image Processing Group - BarcelonaTECH - UPC 96 Oct 26, 2022
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

John 15 Dec 06, 2022
an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch

revisiting-sepconv This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two f

Simon Niklaus 59 Dec 22, 2022
PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation (TPAMI).

PFENet This is the implementation of our paper PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation that has been accepted to IEE

DV Lab 230 Dec 31, 2022
The code release of paper Low-Light Image Enhancement with Normalizing Flow

[AAAI 2022] Low-Light Image Enhancement with Normalizing Flow Paper | Project Page Low-Light Image Enhancement with Normalizing Flow Yufei Wang, Renji

Yufei Wang 176 Jan 06, 2023
Implementation of Invariant Point Attention, used for coordinate refinement in the structure module of Alphafold2, as a standalone Pytorch module

Invariant Point Attention - Pytorch Implementation of Invariant Point Attention as a standalone module, which was used in the structure module of Alph

Phil Wang 113 Jan 05, 2023
Byzantine-robust decentralized learning via self-centered clipping

Byzantine-robust decentralized learning via self-centered clipping In this paper, we study the challenging task of Byzantine-robust decentralized trai

EPFL Machine Learning and Optimization Laboratory 4 Aug 27, 2022
Groceries ARL: Association Rules (Birliktelik Kuralı)

Groceries_ARL Association Rules (Birliktelik Kuralı) Birliktelik kuralları, mark

Şebnem 5 Feb 08, 2022
CVPR2021 Workshop - HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization.

HDRUNet [Paper Link] HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization By Xiangyu Chen, Yihao Liu, Zhengwen Zhang, Yu Qiao an

XyChen 105 Dec 20, 2022
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

CALVIN CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks Oier Mees, Lukas Hermann, Erick Rosete,

Oier Mees 107 Dec 26, 2022
Example of a Quantum LSTM

Example of a Quantum LSTM

Riccardo Di Sipio 36 Oct 31, 2022
Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation

A Theoretical Analysis of the Repetition Problem in Text Generation This repository share the code for the paper "A Theoretical Analysis of the Repeti

Zihao Fu 37 Nov 21, 2022
Voice control for Garry's Mod

WIP: Talonvoice GMod integrations Very work in progress voice control demo for Garry's Mod. HOWTO Install https://talonvoice.com/ Press https://i.imgu

Meta Construct 5 Nov 15, 2022
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.

Video Representation Learning by Recognizing Temporal Transformations [Project Page] Simon Jenni, Givi Meishvili, and Paolo Favaro. In ECCV, 2020. Thi

Simon Jenni 46 Nov 14, 2022
Deep Reinforcement Learning based Trading Agent for Bitcoin

Deep Trading Agent Deep Reinforcement Learning based Trading Agent for Bitcoin using DeepSense Network for Q function approximation. For complete deta

Kartikay Garg 669 Dec 29, 2022