[CoRL 21'] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo

Last update: Jan 04, 2023

Overview

TANDEM: Tracking and Dense Mapping
in Real-time using Deep Multi-view Stereo

Lukas Koestler^1* Nan Yang^1,2*,† Niclas Zeller^2,3 Daniel Cremers^1,2

^*equal contribution ^†corresponding author

^1Technical University of Munich ^2Artisense
^3Karlsruhe University of Applied Sciences

Conference on Robot Learning (CoRL) 2021, London, UK

3DV 2021 Best Demo Award

arXiv | Video | OpenReview | Project Page

Code and Data

📣 CVA-MVSNet released! Please check cva_mvsnet/.
📣 Replica training data released! Please check replica/.
The C++ code will be released before Christmas. Thank you for your patience!

Abstract

In this paper, we present TANDEM, a real-time monocular tracking and dense mapping framework. For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes. To increase the robustness, we propose a novel tracking front-end that performs dense direct image alignment using depth maps rendered from a global model that is built incrementally from dense depth predictions. To predict the dense depth maps, we propose Cascade View-Aggregation MVSNet (CVA-MVSNet) that utilizes the entire active keyframe window by hierarchically constructing 3D cost volumes with adaptive view aggregation to balance the different stereo baselines between the keyframes. Finally, the predicted depth maps are fused into a consistent global map represented as a truncated signed distance function (TSDF) voxel grid. Our experimental results show that TANDEM outperforms other state-of-the-art traditional and learning-based monocular visual odometry (VO) methods in terms of camera tracking. Moreover, TANDEM shows state-of-the-art real-time 3D reconstruction performance.
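
To make the final fusion step more concrete, here is a minimal sketch of the classic weighted running-average TSDF update, reduced to voxels along a single camera ray. It is an illustration under stated assumptions, not the TANDEM implementation: the function name `fuse_depth_into_tsdf`, its parameters, the unit weight per observation, and the 1D setup are all hypothetical choices made for this example.

```python
def fuse_depth_into_tsdf(tsdf, weights, voxel_depths, observed_depth, trunc):
    """Fuse one depth observation into a 1D TSDF along a single camera ray.

    Hypothetical helper for illustration only (not from the TANDEM code base).
    tsdf, weights  : per-voxel running averages and weights, updated in place
    voxel_depths   : depth of each voxel centre along the ray
    observed_depth : depth predicted for this ray (e.g. by a depth network)
    trunc          : truncation distance of the signed distance function
    """
    for i, z in enumerate(voxel_depths):
        sdf = observed_depth - z  # positive in front of the surface
        if sdf < -trunc:
            continue  # far behind the surface: occluded, leave untouched
        d = max(-1.0, min(1.0, sdf / trunc))  # truncate and normalise to [-1, 1]
        w = weights[i]
        tsdf[i] = (w * tsdf[i] + d) / (w + 1.0)  # running average, unit weight
        weights[i] = w + 1.0
    return tsdf, weights


# Two slightly inconsistent depth observations of the same surface.
voxel_depths = [0.5, 1.0, 1.5, 2.0, 2.5]
tsdf = [0.0] * len(voxel_depths)
weights = [0.0] * len(voxel_depths)
for depth in (1.4, 1.6):
    fuse_depth_into_tsdf(tsdf, weights, voxel_depths, depth, trunc=0.5)
```

After both observations are fused, the sign change of `tsdf` sits near the voxel at depth 1.5, the average of the two measurements. Averaging of this kind is what lets noisy per-keyframe depth maps combine into one consistent surface. A full system would apply the same update to a 3D voxel grid, projecting each voxel into the depth map with the camera pose and intrinsics.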

Owner

TUM Computer Vision Group
