Implementation of Memory-Efficient Neural Networks with Multi-Level Generation, ICCV 2021

Last update: Jan 04, 2022

Overview

Memory-Efficient Multi-Level In-Situ Generation (MLG)

By Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen and David Z. Pan.

This repo is the official implementation of "Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation".

Introduction

MLG is a general and unified framework to trade expensive memory transactions with ultra-fast on-chip computations, directly translating to performance improvement. MLG explores the intrinsic correlations and bit-level redundancy within DNN kernels and propose a multi-level in situ generation mechanism with mixed-precision bases to achieve on-the-fly recovery of high-resolution parameters with minimum hardware overhead. MLG can boost the memory efficiency by 10-20× with comparable accuracy over four state-of-theart designs, when benchmarked on ResNet-18/DenseNet121/MobileNetV2/V3 with various tasks

We explore intra-kernel and cross-kernel correlation in the accuracy (blue curve) and memory compression ratio (black curve) space with ResNet18/CIFAR-10. Our method generalizes prior DSConv and Blueprint Conv with better efficiency-performance trade-off.

On CIFAR-10/100 and ResNet-18/DenseNet-121, we surpass prior low-rank methods with 10-20x less weight storage cost.

Dependencies

Python >= 3.6
pyutils >= 0.0.1. See pyutils for installation.
pytorch-onn >= 0.0.2. See pytorch-onn for installation.
Python libraries listed in requirements.txt
NVIDIA GPUs and CUDA >= 10.2

Structures

core/
- models/
  - layers/
    - mlg_conv2d and mlg_linear: MLG layer definition
  - resnet.py: MLG-based ResNet definition
  - model_base.py: base model definition with all model utilities
- builder.py: build training utilities
configs: YAML-based config files
scripts/: contains experiment scripts
train.py: training logic

Usage

Pretrain teacher model.
> python3 train.py configs/cifar10/resnet18/train/pretrain.yml
Train MLG-based student model with L2-norm-based projection, knowledge distillation, multi-level orthonormality regularization, (Bi, Bo, qb, qu, qv) = (2, 44, 3, 6, 3).
> python3 train.py configs/cifar10/resnet18/train/train.yml --teacher.checkpoint=path-to-teacher-ckpt --mlg.projection_alg=train --mlg.kd=1 --mlg.base_in=2 --mlg.base_out=44 --mlg.basis_bit=3 --mlg.coeff_in_bit=6 --mlg.coeff_out_bit=3 --criterion.ortho_weight_loss=0.05
Scripts for experiments are in ./scripts. For example, to run teacher model pretraining, you can write proper task setting in SCRIPT=scripts/cifar10/resnet18/pretrain.py and run
> python3 SCRIPT
To train ML-based student model with KD and projection, you can write proper task setting in SCRIPT=scripts/cifar10/resnet18/train.py (need to provide the pretrained teacher checkpoint) and run
> python3 SCRIPT

Citing Memory-Efficient Multi-Level In-Situ Generation (MLG)

@inproceedings{gu2021MLG,
  title={Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation},
  author={Jiaqi Gu and Hanqing Zhu and Chenghao Feng and Mingjie Liu and Zixuan Jiang and Ray T. Chen and David Z. Pan},
  journal={International Conference on Computer Vision (ICCV)},
  year={2021}
}

Implementation of Memory-Efficient Neural Networks with Multi-Level Generation, ICCV 2021

Related tags

Overview

Memory-Efficient Multi-Level In-Situ Generation (MLG)

Introduction

Dependencies

Structures

Usage

Citing Memory-Efficient Multi-Level In-Situ Generation (MLG)

Related Papers

Owner

Jiaqi Gu

Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

TreeSubstitutionCipher - Encryption system based on trees and substitution

[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias

Implementation of association rules mining algorithms (Apriori|FPGrowth) using python.

The code for paper Efficiently Solve the Max-cut Problem via a Quantum Qubit Rotation Algorithm

Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution

TensorFlow implementation of Deep Reinforcement Learning papers

Deep Learning tutorials in jupyter notebooks.

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

A small demonstration of using WebDataset with ImageNet and PyTorch Lightning

Doods2 - API for detecting objects in images and video streams using Tensorflow

WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking

RefineMask (CVPR 2021)

Official implementation of Densely connected normalizing flows

Depth image based mouse cursor visual haptic

Official implementation for "QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation" (CVPR 2022)

TEA: A Sequential Recommendation Framework via Temporally Evolving Aggregations

This is my research project for the Irving Center for Cancer Dynamics/Azizi Lab, Columbia University.

Easily pull telemetry data and create beautiful visualizations for analysis.

Original code for "Zero-Shot Domain Adaptation with a Physics Prior"