Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Last update: Dec 15, 2022

Overview

NonCuboidRoom

Paper

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Cheng Yang*, Jia Zheng*, Xili Dai, Rui Tang, Yi Ma, Xiaojun Yuan.

[Preprint] [Supplementary Material]

(*: Equal contribution)

Installation

The code is tested with Ubuntu 16.04, PyTorch v1.5, CUDA 10.1 and cuDNN v7.6.

# create conda env
conda create -n layout python=3.6
# activate conda env
conda activate layout
# install pytorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
# install dependencies
pip install -r requirements.txt

Data Preparation

Structured3D Dataset

Please download Structured3D dataset and our processed 2D line annotations. The directory structure should look like:

data
└── Structured3D
    │── Structured3D
    │   ├── scene_00000
    │   ├── scene_00001
    │   ├── scene_00002
    │   └── ...
    └── line_annotations.json

SUN RGB-D Dataset

Please download SUN RGB-D dataset, our processed 2D line annotation for SUN RGB-D dataset, and layout annotations of NYUv2 303 dataset. The directory structure should look like:

data
└── SUNRGBD
    │── SUNRGBD
    │    ├── kv1
    │    ├── kv2
    │    ├── realsense
    │    └── xtion
    │── sunrgbd_train.json      // our extracted 2D line annotations of SUN RGB-D train set
    │── sunrgbd_test.json       // our extracted 2D line annotations of SUN RGB-D test set
    └── nyu303_layout_test.npz  // 2D ground truth layout annotations provided by NYUv2 303 dataset

Pre-trained Models

You can download our pre-trained models here:

The model trained on Structured3D dataset.
The model trained on SUN RGB-D dataset and NYUv2 303 dataset.

Structured3D Dataset

To train the model on the Structured3D dataset, run this command:

python train.py --model_name s3d --data Structured3D

To evaluate the model on the Structured3D dataset, run this command:

python test.py --pretrained DIR --data Structured3D

NYUv2 303 Dataset

To train the model on the SUN RGB-D dataset and NYUv2 303 dataset, run this command:

# first fine-tune the model on the SUN RGB-D dataset
python train.py --model_name sunrgbd --data SUNRGBD --pretrained Structure3D_DIR --split all --lr_step []
# Then fine-tune the model on the NYUv2 subset
python train.py --model_name nyu --data SUNRGBD --pretrained SUNRGBD_DIR --split nyu --lr_step [] --epochs 10

To evaluate the model on the NYUv2 303 dataset, run this command:

python test.py --pretrained DIR --data NYU303

Inference on the customized data

To predict the results of customized images, run this command:

python test.py --pretrained DIR --data CUSTOM

Citation

@article{NonCuboidRoom,
  title   = {Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image},
  author  = {Cheng Yang and
             Jia Zheng and
             Xili Dai and
             Rui Tang and
             Yi Ma and
             Xiaojun Yuan},
  journal = {CoRR},
  volume  = {abs/2104.07986},
  year    = {2021}
}

LICENSE

The code is released under the MIT license. Portions of the code are borrowed from HRNet-Object-Detection and CenterNet.

Acknowledgements

We would like to thank Lei Jin for providing us the code for parsing the layout annotations in SUN RGB-D dataset.

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Related tags

Overview

NonCuboidRoom

Paper

Installation

Data Preparation

Structured3D Dataset

SUN RGB-D Dataset

Pre-trained Models

Structured3D Dataset

NYUv2 303 Dataset

Inference on the customized data

Citation

LICENSE

Acknowledgements

Owner

HyperaPy: An automatic hyperparameter optimization framework ⚡🚀

DRLib：A concise deep reinforcement learning library, integrating HER and PER for almost off policy RL algos.

A demonstration of using a live Tensorflow session to create an interactive face-GAN explorer.

Bald-to-Hairy Translation Using CycleGAN

Learning To Have An Ear For Face Super-Resolution

A repo with study material, exercises, examples, etc for Devnet SPAUTO

Implementation of PyTorch-based multi-task pre-trained models

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions"

Code for the paper One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation, CVPR 2021.

(ICONIP 2020) MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image

Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network

This is the winning solution of the Endocv-2021 grand challange.

This repository contains small projects related to Neural Networks and Deep Learning in general.

Code repository for the paper Computer Vision User Entity Behavior Analytics

Wide Residual Networks (WideResNets) in PyTorch

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes

[ WSDM '22 ] On Sampling Collaborative Filtering Datasets

This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"