[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data

Overview

DeepDeform (CVPR'2020)

DeepDeform is an RGB-D video dataset containing over 390,000 RGB-D frames in 400 videos, with 5,533 optical and scene flow images and 4,479 foreground object masks. We also provide 149,228 sparse match annotations and 63,512 occlusion point annotations.

Download Data

If you would like to download the DeepDeform data, please fill out this Google form. Once your request is accepted, we will send you the download link.

Online Benchmark

If you want to participate in the benchmark(s), you can submit your results on the DeepDeform Benchmark website.

Currently we provide benchmarks for the following tasks: optical flow and non-rigid reconstruction.

When you upload your results on the test set to the DeepDeform Benchmark website, the performance of your method is automatically evaluated against the hidden test labels and compared to other previously evaluated methods. You can decide whether or not to make the evaluation results public.

If you want to evaluate on the validation set, the code used to evaluate each benchmark is provided in the evaluation/ directory. To evaluate optical flow or non-rigid reconstruction, set FLOW_RESULTS_DIR or RECONSTRUCTION_RESULTS_DIR in config.py to your results directory, which must be in the same format as for the online submission (described here).
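As a rough illustration, the relevant part of config.py might look like the sketch below. Only the two variable names come from this README; both paths are placeholders that you would replace with your own results directories.

```python
# config.py (sketch): both paths are placeholders, not defaults shipped with the code.
FLOW_RESULTS_DIR = "/path/to/your/flow_results"
RECONSTRUCTION_RESULTS_DIR = "/path/to/your/reconstruction_results"
```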

In order to evaluate reconstruction, you need to compile additional C++ modules.

  • Install necessary dependencies:
pip install pybind11
pip install Pillow
pip install plyfile
pip install tqdm
pip install scikit-image
  • Inside evaluation/csrc, adapt includes.py to point to your Eigen include directory.

  • Compile the code by executing the following in evaluation/csrc:

python setup.py install

Data Organization

Data is organized into three subsets, stored in the train, val, and test directories, using a 340-30-30 sequence split. In every subset, each RGB-D sequence is stored in a directory <sequence_id> with the following layout:

<sequence_id>
|-- <color>: color images for every frame (`%06d.jpg`)
|-- <depth>: depth images for every frame (`%06d.png`)
|-- <mask>: mask images for a few frames (`%06d.png`)
|-- <optical_flow>: optical flow images for a few frame pairs (`<object_id>_<source_id>_<target_id>.oflow` or `%s_%06d_%06d.oflow`)
|-- <scene_flow>: scene flow images for a few frame pairs (`<object_id>_<source_id>_<target_id>.sflow` or `%s_%06d_%06d.sflow`)
|-- <intrinsics.txt>: 4x4 intrinsics matrix
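The layout above can be turned into file paths with a small helper. The following is a minimal sketch, assuming the subdirectories are literally named color, depth, optical_flow and scene_flow and that frame indices are zero-padded to six digits as the `%06d` patterns suggest. The function names, the root directory and the sequence and object ids used in the usage lines are hypothetical.

```python
from pathlib import Path


def frame_paths(root, split, sequence_id, frame_id):
    """Return (color_path, depth_path) for one frame of a sequence."""
    seq_dir = Path(root) / split / sequence_id
    name = f"{frame_id:06d}"  # matches the `%06d` naming pattern
    return seq_dir / "color" / f"{name}.jpg", seq_dir / "depth" / f"{name}.png"


def flow_path(root, split, sequence_id, object_id, source_id, target_id,
              kind="optical_flow"):
    """Return the path of an optical or scene flow file for a frame pair."""
    ext = {"optical_flow": "oflow", "scene_flow": "sflow"}[kind]
    filename = f"{object_id}_{source_id:06d}_{target_id:06d}.{ext}"
    return Path(root) / split / sequence_id / kind / filename


# Hypothetical ids, for illustration only.
color, depth = frame_paths("data", "train", "seq000", 12)
oflow = flow_path("data", "train", "seq000", "obj", 0, 12)
```

Building the paths in one place keeps the naming convention out of the rest of your loading code.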

All labels are provided as .json files in the root dataset directory:

  • train_matches.json and val_matches.json:
    Manually annotated sparse matches.
  • train_dense.json and val_dense.json:
    Densely aligned optical and scene flow images with the use of sparse matches as a guidance.
  • train_selfsupervised.json and val_selfsupervised.json:
    Densely aligned optical and scene flow images using self-supervision (DynamicFusion pipeline) for a few sequences.
  • train_masks.json and val_masks.json:
    Dynamic object annotations for a few frames per sequence.
  • train_occlusions.json and val_occlusions.json:
    Manually annotated sparse occlusions.

Data Formats

We recommend trying the scripts in the demo/ directory to see how the different file types are loaded.

RGB-D Data: 3D data is provided as RGB-D video sequences, where color and depth images are already aligned. Color images are provided as 8-bit RGB .jpg, and depth images as 16-bit .png (divide by 1000 to obtain depth in meters).

Camera Parameters: A 4x4 intrinsic matrix is given for every sequence. Because different cameras were used for data capture, the intrinsic matrix can differ from sequence to sequence. Since the color and depth images are aligned, no extrinsic transformation is necessary.
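To make the depth scaling and the intrinsics concrete, here is a minimal sketch that back-projects one pixel to a 3D point in camera space. It assumes intrinsics.txt holds four whitespace-separated rows and that the matrix follows the usual pinhole convention (fx and fy on the diagonal, cx and cy in the third column). Neither detail is confirmed by this README, and the function names and sample values are hypothetical.

```python
def parse_intrinsics(text):
    """Parse a 4x4 matrix from whitespace-separated text (assumed file layout)."""
    rows = [[float(v) for v in line.split()]
            for line in text.strip().splitlines() if line.strip()]
    if len(rows) != 4 or any(len(row) != 4 for row in rows):
        raise ValueError("expected a 4x4 matrix")
    return rows


def backproject(u, v, depth_raw, K):
    """Back-project pixel (u, v) with a raw 16-bit depth value to meters."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]  # assumed pinhole layout
    z = depth_raw / 1000.0  # raw depth divided by 1000 gives meters
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


# Made-up intrinsics, for illustration only.
K = parse_intrinsics("""
500 0 320 0
0 500 240 0
0 0 1 0
0 0 0 1
""")
point = backproject(420, 240, 2000, K)
```

Because color and depth are already aligned, the same pixel coordinates index both images.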

Optical Flow Data: Dense optical flow data is provided as a custom binary image of resolution 640x480 with the extension .oflow. Every pixel contains two values, the flow in the x and y directions, in pixels. Helper functions to load and store binary flow images are provided in utils.py.

Scene Flow Data: Dense scene flow data is provided as a custom binary image of resolution 640x480 with the extension .sflow. Every pixel contains three values, the flow in the x, y and z directions, in meters. Helper functions to load and store binary flow images are provided in utils.py.
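The two flow types can be related with a little arithmetic. The sketch below assumes both are displacements from the source frame to the target frame, so that a source pixel plus its optical flow gives the target pixel, and a source 3D point plus its scene flow gives the target 3D point. It does not read the binary files (use the helpers in utils.py for that), and all function names and numbers are made up for illustration.

```python
def warp_pixel(pixel, optical_flow):
    """Move a source pixel (u, v) by its optical flow (du, dv), in pixels."""
    return (pixel[0] + optical_flow[0], pixel[1] + optical_flow[1])


def warp_point(point, scene_flow):
    """Move a source 3D point by its scene flow (dx, dy, dz), in meters."""
    return tuple(p + f for p, f in zip(point, scene_flow))


def project(point, fx, fy, cx, cy):
    """Project a camera-space 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Made-up values: a point 2 m from the camera moves 0.1 m along x.
target_point = warp_point((0.4, 0.0, 2.0), (0.1, 0.0, 0.0))
target_pixel = project(target_point, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
# If the two labels are consistent, the optical flow at source pixel (420, 240)
# would be target_pixel minus the source pixel.
```

A check like this can be a quick sanity test after loading a flow pair for the same frames.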

Object Mask Data: A few frames per sequence also include a foreground dynamic object annotation. The mask is given as a 16-bit .png image (1 for object, 0 for background).

Sparse Match Annotations: We provide manual sparse match annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to corresponding source and target RGB-D frames, as a list of source and target pixels.

Sparse Occlusion Annotations: We provide manual sparse occlusion annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to the corresponding source and target RGB-D frames, as a list of occluded pixels in the source frame.
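Since the exact JSON schema is not spelled out here, a schema-agnostic way to explore an annotation file is to print its nested structure before writing a loader. The helper below is a minimal sketch. The function name is hypothetical, and the sample keys in the usage lines are invented and do not reflect the real annotation keys.

```python
import json


def describe(value, depth=0, max_depth=3):
    """Return a short description of a parsed JSON value's structure."""
    if isinstance(value, dict):
        if depth >= max_depth:
            return "dict"
        inner = ", ".join(f"{key}: {describe(item, depth + 1, max_depth)}"
                          for key, item in value.items())
        return "{" + inner + "}"
    if isinstance(value, list):
        if not value or depth >= max_depth:
            return f"list[{len(value)}]"
        # Only the first element is inspected; lists are assumed homogeneous.
        return f"list[{len(value)}] of {describe(value[0], depth + 1, max_depth)}"
    return type(value).__name__


# Invented sample; for a real file use json.load(open("train_matches.json")).
sample = json.loads('[{"a": 1, "b": [1.5, 2.5]}]')
print(describe(sample))
```

Running it on a real label file shows which keys hold the frame paths and which hold the pixel lists.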

Citation

If you use the DeepDeform data or code, please cite:

@inproceedings{bozic2020deepdeform, 
    title={DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data}, 
    author={Bo{\v{z}}i{\v{c}}, Alja{\v{z}} and Zollh{\"o}fer, Michael and Theobalt, Christian and Nie{\ss}ner, Matthias}, 
    booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2020}
}

Help

If you have any questions, please contact us at [email protected], or open an issue on GitHub.

License

The data is released under the DeepDeform Terms of Use, and the code is released under a non-commercial Creative Commons license.

Owner
Aljaz Bozic
PhD Student at Visual Computing Group