Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Last update: Jan 02, 2023

Related tags

Deep Learning Mask2Former

Overview

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation

Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar

[arXiv] [Project] [BibTeX]

Features

A single architecture for panoptic, instance and semantic segmentation.
Support major segmentation datasets: ADE20K, Cityscapes, COCO, Mapillary Vistas.

Installation

See installation instructions.

Getting Started

See Preparing Datasets for Mask2Former.

See Getting Started with Mask2Former.

Advanced usage

See Advanced Usage of Mask2Former.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the Mask2Former Model Zoo.

License

Shield:

The majority of Mask2Former is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

However portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license, Deformable-DETR is licensed under the Apache-2.0 License.

Citing Mask2Former

If you use Mask2Former in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@article{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  journal={arXiv},
  year={2021}
}

If you find the code useful, please also consider the following BibTeX entry.

@inproceedings{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Bowen Cheng and Alexander G. Schwing and Alexander Kirillov},
  journal={NeurIPS},
  year={2021}
}

Acknowledgement

Code is largely based on MaskFormer (https://github.com/facebookresearch/MaskFormer).

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Related tags

Overview

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation

Features

Installation

Getting Started

Advanced usage

Model Zoo and Baselines

License

Citing Mask2Former

Acknowledgement

Owner

Meta Research

A tensorflow implementation of Fully Convolutional Networks For Semantic Segmentation

Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Semi-Supervised Semantic Segmentation with Pixel-Level Contrastive Learning from a Class-wise Memory Bank

Implementation of Pix2Seq in PyTorch

PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

MLSpace: Hassle-free machine learning & deep learning development

PyGCL: Graph Contrastive Learning Library for PyTorch

Vehicle direction identification consists of three module detection , tracking and direction recognization.

Pytorch implementation of CVPR2020 paper “VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation”

TensorFlow implementation of "Variational Inference with Normalizing Flows"

Unofficial implementation of the paper: PonderNet: Learning to Ponder in TensorFlow

Nerf pl - NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning

PyTorch code of my WACV 2022 paper Improving Model Generalization by Agreement of Learned Representations from Data Augmentation

The FIRST GANs-based omics-to-omics translation framework

A custom DeepStack model for detecting 16 human actions.

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Neural Networks.

LyaNet: A Lyapunov Framework for Training Neural ODEs

Representing Long-Range Context for Graph Neural Networks with Global Attention

A vision library for performing sliced inference on large images/small objects