PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Overview

Future urban scene generation through vehicle synthesis

This repository contains the PyTorch code for the ICPR 2020 paper "Future Urban Scene Generation Through Vehicle Synthesis" [arXiv].

Model architecture

Our framework consists of two stages:

  1. Interpretable information extraction: high-level interpretable information (bounding boxes, trajectories, keypoints) is extracted from raw RGB frames.
  2. Novel view completion: a reprojected 3D model is conditioned on the original 2D appearance.

Multi stage pipeline

Abstract

In this work we propose a deep learning pipeline to predict the visual future appearance of an urban scene. Despite recent advances, generating the entire scene in an end-to-end fashion is still far from being achieved. Instead, here we follow a two-stage approach, where interpretable information is included in the loop and each actor is modelled independently. We leverage a per-object novel view synthesis paradigm, i.e. generating a synthetic representation of an object undergoing a geometrical roto-translation in 3D space. Our model can be easily conditioned with constraints (e.g. input trajectories) provided by state-of-the-art tracking methods or by the user.
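
The roto-translation mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the repository's implementation: the function names, the choice of the vertical (y) axis as rotation axis, the pinhole camera parameters, and the sample keypoints are all assumptions made for illustration.

```python
import math


def roto_translate(points, yaw_rad, translation):
    """Rotate 3D points about the vertical (y) axis, then translate them.

    Hypothetical helper: the axis convention is an assumption.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    moved = []
    for x, y, z in points:
        # Rotation about y mixes the x and z coordinates; y is unchanged.
        moved.append((c * x + s * z + tx, y + ty, -s * x + c * z + tz))
    return moved


def project(points, focal, center):
    """Pinhole projection of camera-frame 3D points to pixel coordinates."""
    cx, cy = center
    return [(focal * x / z + cx, focal * y / z + cy) for x, y, z in points]


# Two made-up keypoints on a vehicle model, one metre either side of its centre.
keypoints = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]

# Turn the model by 90 degrees and place it 10 m in front of the camera.
moved = roto_translate(keypoints, math.pi / 2, (0.0, 0.0, 10.0))
pixels = project(moved, focal=1000.0, center=(640.0, 360.0))
```

With this 90 degree turn, both sample keypoints end up on the optical axis (at depths 9 and 11), so up to floating-point rounding they project onto the assumed principal point (640, 360).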

Sequence result example


Code

Code was tested with an Anaconda environment (Python version 3.6) on both Linux and Windows based systems.

Install

Run the following commands to install all requirements in a new virtual environment:

conda create -n <env_name> python=3.6
conda activate <env_name>
pip install -r requirements.txt

Then install the PyTorch package (version 1.3 or above).
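
To confirm that the installed PyTorch satisfies the 1.3 minimum, a small check along these lines may help. The helper name meets_minimum is hypothetical and not part of this repository; the only PyTorch attribute used is torch.__version__.

```python
import re


def meets_minimum(version, minimum=(1, 3)):
    """Return True if a version string such as '1.13.1+cu117' is >= minimum."""
    # Drop any local build suffix (e.g. '+cu117') before reading the numbers.
    numbers = re.findall(r"\d+", version.split("+")[0])
    major_minor = tuple(int(n) for n in numbers[:2])
    return major_minor >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "older than 1.3"
        print(torch.__version__, status)
```

Comparing integer tuples rather than raw strings avoids the pitfall where "1.13" sorts before "1.3" lexicographically.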

How to run the test

To run the demo of our project, first download all the required data at this link and save them in a directory <data_dir> of your choice. We tested our pipeline on the CityFlow dataset, which already provides annotated bounding boxes and trajectories of vehicles.

The test script is run_test.py, which requires three positional arguments: the video directory, the 3D keypoints directory, and the checkpoints directory.

python run_test.py <data_dir>/<video_dir> <data_dir>/pascal_cads <data_dir>/checkpoints --det_mode ssd512|yolo3|mask_rcnn --track_mode tc|deepsort|moana --bbox_scale 1.15 --device cpu|cuda

Add the --inpaint flag to use inpainting on the vehicle region instead of the static background.
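
For readers who want to see how the documented command line fits together, here is a sketch of an argument parser that accepts the same positional arguments and options. It is a reconstruction from the command shown above, not the actual contents of run_test.py: the destination names (video_dir, cads_dir, checkpoints_dir), the help strings, and all default values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented run_test.py command line."""
    parser = argparse.ArgumentParser(description="Sketch of the run_test.py interface")
    # Three mandatory positional arguments, in the documented order.
    parser.add_argument("video_dir", help="directory with the input video data")
    parser.add_argument("cads_dir", help="directory with the 3D keypoints (pascal_cads)")
    parser.add_argument("checkpoints_dir", help="directory with the model checkpoints")
    # Optional settings; the defaults below are assumptions, not the script's.
    parser.add_argument("--det_mode", choices=["ssd512", "yolo3", "mask_rcnn"], default=None)
    parser.add_argument("--track_mode", choices=["tc", "deepsort", "moana"], default=None)
    parser.add_argument("--bbox_scale", type=float, default=1.15)
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    parser.add_argument("--inpaint", action="store_true",
                        help="use inpainting instead of the static background")
    return parser


# Example invocation with made-up paths.
args = build_parser().parse_args([
    "data/video", "data/pascal_cads", "data/checkpoints",
    "--det_mode", "yolo3", "--track_mode", "deepsort", "--inpaint",
])
```

Restricting --det_mode, --track_mode and --device with choices makes a misspelled option fail at parse time rather than deep inside the pipeline.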

Description and GUI usage

If everything went well, you should see the main GUI, where you can select any vehicle detected in the current video frame or move to a different frame.

GUI window

The following commands are available in this window:

  1. RIGHT ARROW = go to next frame
  2. LEFT ARROW = go to previous frame
  3. SINGLE MOUSE LEFT BUTTON CLICK = visualize car trajectory
  4. BACKSPACE = delete the drawn trajectories
  5. DOUBLE MOUSE LEFT BUTTON CLICK = select one of the vehicle bounding boxes

Once you have selected the vehicles of your choice by double-clicking inside their bounding boxes, press the RUN button to start the inference. The resulting frames will be saved in the ./results directory.
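
The interaction described above can be summarised as a small state machine, independent of any GUI toolkit. The sketch below is purely illustrative and does not reflect the repository's GUI code: the class ViewerState, its key names, the clamping at the first and last frame, and the box format (x1, y1, x2, y2) are assumptions.

```python
class ViewerState:
    """Toy model of the GUI state: current frame, shown trajectories, selection."""

    def __init__(self, num_frames, boxes_per_frame):
        self.num_frames = num_frames
        # Maps frame index -> {vehicle_id: (x1, y1, x2, y2)}.
        self.boxes_per_frame = boxes_per_frame
        self.frame = 0
        self.trajectories = []   # vehicle ids whose trajectory is drawn
        self.selected = set()    # vehicle ids chosen for inference

    def _hit(self, x, y):
        """Return the id of the first box in the current frame containing (x, y)."""
        for vehicle_id, (x1, y1, x2, y2) in self.boxes_per_frame.get(self.frame, {}).items():
            if x1 <= x <= x2 and y1 <= y <= y2:
                return vehicle_id
        return None

    def on_key(self, key):
        if key == "RIGHT":        # go to next frame (clamped at the last one)
            self.frame = min(self.frame + 1, self.num_frames - 1)
        elif key == "LEFT":       # go to previous frame (clamped at the first one)
            self.frame = max(self.frame - 1, 0)
        elif key == "BACKSPACE":  # delete the drawn trajectories
            self.trajectories.clear()

    def on_click(self, x, y, double=False):
        vehicle_id = self._hit(x, y)
        if vehicle_id is None:
            return
        if double:                # double click selects the vehicle
            self.selected.add(vehicle_id)
        elif vehicle_id not in self.trajectories:
            self.trajectories.append(vehicle_id)  # single click shows its trajectory


# Example session with one made-up vehicle (id 7) in frame 0.
state = ViewerState(num_frames=3, boxes_per_frame={0: {7: (10, 10, 50, 50)}})
state.on_click(20, 20)               # visualize trajectory of vehicle 7
state.on_click(20, 20, double=True)  # select vehicle 7
```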

Cite

If you find this repository useful for your research, please cite the following paper:

@inproceedings{simoni2021future,
  title={Future urban scenes generation through vehicles synthesis},
  author={Simoni, Alessandro and Bergamini, Luca and Palazzi, Andrea and Calderara, Simone and Cucchiara, Rita},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={4552--4559},
  year={2021},
  organization={IEEE}
}
Owner
Alessandro Simoni
PhD Student @ University of Modena and Reggio Emilia (@aimagelab)