Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Last update: Dec 26, 2022

Overview

Bidirectional Projection Network for Cross Dimension Scene Understanding

CVPR 2021 (Oral)

Existing segmentation methods are mostly unidirectional, i.e. utilizing 3D for 2D segmentation or vice versa. Obviously 2D and 3D information can nicely complement each other in both directions, during the segmentation. This is the goal of bidirectional projection network.

Environment

Main

# Torch
$ pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
# MinkowskiEngine 0.4.1
$ conda install numpy openblas
$ git clone https://github.com/StanfordVL/MinkowskiEngine.git
$ cd MinkowskiEngine
$ git checkout f1a419cc5792562a06df9e1da686b7ce8f3bb5ad
$ python setup.py install
# Others
$ pip install imageio==2.8.0 opencv-python==4.2.0.32 pillow==7.0.0 pyyaml==5.3 scipy==1.4.1 sharedarray==3.2.0 tensorboardx==2.0 tqdm==4.42.1

Others

Please refer to env.yml for details.

Prepare data

Download the dataset from official website.
2D: The scripts is from 3DMV repo, it is based on python2, other code in this repo is based on python3 python prepare_2d_data.py --scannet_path data/scannetv2 --output_path data/scannetv2_images --export_label_images
3D: dataset/preprocess_3d_scannet.py

Config

BPNet_5cm: config/scannet/bpnet_5cm.yaml

Training

Download pretrained 2D ResNets on ImageNet from PyTorch website, and put them into the initmodel folder.

model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}

Start training: sh tool/train.sh EXP_NAME /PATH/TO/CONFIG NUMBER_OF_THREADS
Resume: sh tool/resume.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS

NUMBER_OF_THREADS is the threads to use per process (gpu), so optimally, it should be Total_threads / gpu_number_used

Testing

Testing using your trained model or our pre-trained model (voxel_size: 5cm): sh tool/test.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS)

Copyright and License

You are granted with the LICENSE for both academic and commercial usages.

Acknowledgment

Our code is based on MinkowskiEngine. We also referred to SparseConvNet and semseg.

Citation

@inproceedings{hu-2021-bidirectional,
        author      = {Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong},
        title       = {Bidirectional Projection Network for Cross Dimensional Scene Understanding},
        booktitle   = {CVPR},
        year        = {2021}
    }

Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Related tags

Overview

Bidirectional Projection Network for Cross Dimension Scene Understanding

Environment

Prepare data

Config

Training

Testing

Copyright and License

Acknowledgment

Citation

Owner

Hu Wenbo

Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"

The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

Official code of paper: MovingFashion: a Benchmark for the Video-to-Shop Challenge

Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

Deploy a ML inference service on a budget in less than 10 lines of code.

Explaining in Style: Training a GAN to explain a classifier in StyleSpace

Multi-agent reinforcement learning algorithm and environment

Yolo object detection - Yolo object detection with python

Categorizing comments on YouTube into different categories.

A lightweight python AUTOmatic-arRAY library.

Direct design of biquad filter cascades with deep learning by sampling random polynomials.

An OpenAI Gym environment for Super Mario Bros

Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

FS2KToolbox FS2K Dataset Towards the translation between Face

A Lightweight Experiment & Resource Monitoring Tool 📺

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Official repository for the CVPR 2021 paper "Learning Feature Aggregation for Deep 3D Morphable Models"

Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.