Nvidia Semantic Segmentation monorepo

Last update: Jan 04, 2023

Related tags

Overview

Paper | YouTube | Cityscapes Score

Pytorch implementation of our paper Hierarchical Multi-Scale Attention for Semantic Segmentation.

Please refer to the sdcnet branch if you are looking for the code corresponding to Improving Semantic Segmentation via Video Prediction and Label Relaxation.

Installation

The code is tested with pytorch 1.3 and python 3.6
You can use ./Dockerfile to build an image.

Download Weights

Create a directory where you can keep large files. Ideally, not in this directory.

  > mkdir <large_asset_dir>

Update __C.ASSETS_PATH in config.py to point at that directory

__C.ASSETS_PATH=<large_asset_dir>
Download pretrained weights from google drive and put into <large_asset_dir>/seg_weights

Download/Prepare Data

If using Cityscapes, download Cityscapes data, then update config.py to set the path:

__C.DATASET.CITYSCAPES_DIR=<path_to_cityscapes>

Download Autolabelled-Data from google drive

If using Cityscapes Autolabelled Images, download Cityscapes data, then update config.py to set the path:

__C.DATASET.CITYSCAPES_CUSTOMCOARSE=<path_to_cityscapes>

If using Mapillary, download Mapillary data, then update config.py to set the path:

__C.DATASET.MAPILLARY_DIR=<path_to_mapillary>

Running the code

The instructions below make use of a tool called runx, which we find useful to help automate experiment running and summarization. For more information about this tool, please see runx. In general, you can either use the runx-style commandlines shown below. Or you can call python train.py <args ...> directly if you like.

Run inference on Cityscapes

Dry run:

> python -m runx.runx scripts/eval_cityscapes.yml -i -n

This will just print out the command but not run. It's a good way to inspect the commandline.

Real run:

> python -m runx.runx scripts/eval_cityscapes.yml -i

The reported IOU should be 86.92. This evaluates with scales of 0.5, 1.0. and 2.0. You will find evaluation results in ./logs/eval_cityscapes/...

Run inference on Mapillary

> python -m runx.runx scripts/eval_mapillary.yml -i

The reported IOU should be 61.05. Note that this must be run on a 32GB node and the use of 'O3' mode for amp is critical in order to avoid GPU out of memory. Results in logs/eval_mapillary/...

Dump images for Cityscapes

> python -m runx.runx scripts/dump_cityscapes.yml -i

This will dump network output and composited images from running evaluation with the Cityscapes validation set.

Run inference and dump images on a folder of images

> python -m runx.runx scripts/dump_folder.yml -i

You should end up seeing images that look like the following:

Train a model

Train cityscapes, using HRNet + OCR + multi-scale attention with fine data and mapillary-pretrained model

> python -m runx.runx scripts/train_cityscapes.yml -i

The first time this command is run, a centroid file has to be built for the dataset. It'll take about 10 minutes. The centroid file is used during training to know how to sample from the dataset in a class-uniform way.

This training run should deliver a model that achieves 84.7 IOU.

Train SOTA default train-val split

> python -m runx.runx  scripts/train_cityscapes_sota.yml -i

Again, use -n to do a dry run and just print out the command. This should result in a model with 86.8 IOU. If you run out of memory, try to lower the crop size or turn off rmi_loss.

Nvidia Semantic Segmentation monorepo

Related tags

Overview

Paper | YouTube | Cityscapes Score

Installation

Download Weights

Download/Prepare Data

Running the code

Run inference on Cityscapes

Run inference on Mapillary

Dump images for Cityscapes

Run inference and dump images on a folder of images

Train a model

Train SOTA default train-val split

Owner

NVIDIA Corporation

Session-aware Item-combination Recommendation with Transformer Network

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

FairyTailor: Multimodal Generative Framework for Storytelling

Official PyTorch implementation of StyleGAN3

Car Parking Tracker Using OpenCv

ScaleNet: A Shallow Architecture for Scale Estimation

Exploring the link between uncertainty estimates obtained via "exact" Bayesian inference and out-of-distribution (OOD) detection.

Codes for CVPR2021 paper "PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization"

A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework".

Code Release for ICCV 2021 (oral), "AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds"

PyImpetus is a Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features

A project to make Amazon Echo respond to sign language using your webcam

Implementation for Shape from Polarization for Complex Scenes in the Wild

RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?

A Unified Generative Framework for Various NER Subtasks.

Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

Ontologysim: a Owlready2 library for applied production simulation

Train DeepLab for Semantic Image Segmentation

PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモ