Sparse R-CNN: End-to-End Object Detection with Learnable Proposals, CVPR2021

Last update: Dec 27, 2022

Overview

Paper (CVPR 2021): Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Updates

(02/03/2021) Higher performance is reported using the stronger backbone model PVT.
(23/02/2021) Higher performance is reported using the stronger pre-trained model DetCo.
(02/12/2020) Models and logs (R101_100pro_3x and R101_300pro_3x) are available.
(26/11/2020) Models and logs (R50_100pro_3x and R50_300pro_3x) are available.
(26/11/2020) Higher performance for Sparse R-CNN is reported when the dropout rate is set to 0.0.

Models

Method	inf_time	train_time	box AP	download
R50_100pro_3x	23 FPS	19h	42.8	model | log
R50_300pro_3x	22 FPS	24h	45.0	model | log
R101_100pro_3x	19 FPS	25h	44.1	model | log
R101_300pro_3x	18 FPS	29h	46.4	model | log

Models and logs are available on Baidu Drive with the access code wt9n.

Notes

We observe about 0.3 AP of noise.
Training time is measured on 8 GPUs with a batch size of 16. Inference time is measured on a single GPU. All GPUs are NVIDIA V100.
We use models pre-trained on ImageNet with torchvision, and we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.
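To make the trade-offs in the Models table easier to compare, the following minimal sketch converts the reported numbers into per-image latency and total GPU-hours, and checks whether an AP difference is larger than the 0.3 AP noise mentioned above. The function names are illustrative and not part of the repository, and treating latency as 1000 / FPS assumes one image is processed at a time.

```python
# Figures copied from the Models table: (FPS, training hours, box AP).
MODELS = {
    "R50_100pro_3x": (23, 19, 42.8),
    "R50_300pro_3x": (22, 24, 45.0),
    "R101_100pro_3x": (19, 25, 44.1),
    "R101_300pro_3x": (18, 29, 46.4),
}
NUM_TRAIN_GPUS = 8  # training time is reported on 8 GPUs (see Notes)
AP_NOISE = 0.3      # observed AP noise (see Notes)


def latency_ms(fps):
    """Per-image latency in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps


def gpu_hours(train_hours, num_gpus=NUM_TRAIN_GPUS):
    """Total GPU-hours for a run that took train_hours on num_gpus GPUs."""
    return train_hours * num_gpus


def exceeds_noise(ap_a, ap_b, noise=AP_NOISE):
    """True if the AP gap between two models is larger than the noise level."""
    return abs(ap_a - ap_b) > noise


if __name__ == "__main__":
    for name, (fps, hours, ap) in MODELS.items():
        print(f"{name}: {latency_ms(fps):.1f} ms/image, "
              f"{gpu_hours(hours)} GPU-hours, {ap} box AP")
```

For example, going from 100 to 300 proposals with ResNet-50 raises box AP from 42.8 to 45.0, a gap well above the noise level, at the cost of 5 more hours of training on 8 GPUs.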

Method	inf_time	train_time	box AP	codebase
R50_300pro_3x	22 FPS	24h	45.0	detectron2
R50_300pro_3x.detco	22 FPS	28h	46.5	detectron2
PVTSmall_300pro_3x	13 FPS	50h	45.7	mmdetection
PVTv2-b2_300pro_3x	11 FPS	76h	50.1	mmdetection

Installation

The codebase is built on top of Detectron2 and DETR.

Requirements

Linux or macOS with Python ≥ 3.6
PyTorch ≥ 1.5 and a torchvision version that matches the PyTorch installation. Installing both together from pytorch.org ensures that they match.
OpenCV is optional, but it is needed for the demo and visualization.
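The requirements above can be checked before building. The sketch below is an illustrative helper, not part of the repository: it compares version numbers numerically (a plain string comparison would rank "1.10" below "1.5") and reports OpenCV only as an optional note. It does not verify that torchvision matches the installed PyTorch.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0)."""
    numbers = []
    # Drop a local build suffix ('+cu113'), then read leading digits per part.
    for part in version.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(version, minimum):
    """True if the version string is at least the minimum tuple, e.g. (1, 5)."""
    return version_tuple(version) >= minimum


def check_environment():
    """Return a list of human-readable findings; empty means all checks passed."""
    findings = []
    if sys.version_info[:2] < (3, 6):
        findings.append("Python >= 3.6 is required")
    try:
        import torch
    except ImportError:
        findings.append("PyTorch is not installed")
    else:
        if not meets_minimum(str(torch.__version__), (1, 5)):
            findings.append("PyTorch >= 1.5 is required, found " + str(torch.__version__))
    try:
        import cv2  # noqa: F401  (only needed for the demo and visualization)
    except ImportError:
        findings.append("OpenCV not found (optional: demo and visualization only)")
    return findings


if __name__ == "__main__":
    for finding in check_environment() or ["All checks passed"]:
        print(finding)
```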

Steps

Install and build libs

git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop

Link the COCO dataset path to SparseR-CNN/datasets/coco

mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
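A mistyped source path in the ln -s commands above produces a dangling link that only fails later, when training starts. As a quick sanity check, this small sketch (an illustrative helper, not part of the repository) reports which of the three linked directories do not resolve. It is meant to be run from the SparseR-CNN directory.

```python
import os

# The three entries linked in the step above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(root="datasets/coco"):
    """Return the expected subdirectories that do not resolve to a directory."""
    # os.path.isdir follows symlinks, so a dangling link is reported as missing.
    return [name for name in EXPECTED_SUBDIRS
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    missing = missing_coco_dirs()
    if missing:
        print("Missing or broken links under datasets/coco:", ", ".join(missing))
    else:
        print("datasets/coco layout looks complete")
```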

Train SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml
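The command above matches the reported setup of 8 GPUs with a total batch size of 16. On a machine with fewer GPUs, one option is to keep 2 images per GPU and override the total batch size through Detectron2's SOLVER.IMS_PER_BATCH key, passed as a trailing KEY VALUE pair in the same way MODEL.WEIGHTS is passed for evaluation. The helper below is a hypothetical sketch that only assembles such a command. The learning rate and schedule may also need adjusting, and results under a different batch size are not covered by the tables above.

```python
import shlex

CONFIG = "projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml"
IMAGES_PER_GPU = 2  # 16 images / 8 GPUs in the reported setup


def build_train_command(num_gpus, config_file=CONFIG, images_per_gpu=IMAGES_PER_GPU):
    """Assemble the training command for num_gpus GPUs as a list of arguments."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "python", "projects/SparseRCNN/train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
        # Assumption: scale the total batch size with the number of GPUs.
        "SOLVER.IMS_PER_BATCH", str(num_gpus * images_per_gpu),
    ]


if __name__ == "__main__":
    # Print a shell-ready command for a 4-GPU machine.
    print(" ".join(shlex.quote(arg) for arg in build_train_command(4)))
```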

Evaluate SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth

Visualize SparseR-CNN

python demo/demo.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
    --opts MODEL.WEIGHTS path/to/model.pth

Third-party resources

mmdetection implementation: sparse_rcnn. Thanks to Shilong Zhang!
cvpod implementation: sparse_rcnn. Thanks to Benjin Zhu!
paddledetection implementation: sparse_rcnn. Thanks to FL77N!

License

SparseR-CNN is released under the MIT License.

Citing

If you use SparseR-CNN in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:

@article{peize2020sparse,
  title   =  {{SparseR-CNN}: End-to-End Object Detection with Learnable Proposals},
  author  =  {Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2011.12450},
  year    =  {2020}
}

Owner

Peize Sun
