Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Overview

Oriented RepPoints for Aerial Object Detection


This repository contains the implementation of “Oriented RepPoints + Swin Transformer/ReResNet”.

Introduction

Using the Oriented RepPoints detector with a Swin Transformer backbone, this work achieved 3rd place on Task 1 and 2nd place on Task 2 of the 2021 Learning to Understand Aerial Images (LUAI) challenge, held at ICCV 2021. Details are given in the paper "LUAI Challenge 2021 on Learning to Understand Aerial Images, ICCVW2021".

New Features

  • Backbone: added Swin-Transformer and ReResNet
  • DataAug: added Mosaic4or9, Mixup, HSV, RandomPerspective, RandomScaleCrop, DataAug out
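
Of the augmentations listed above, Mixup is the easiest to illustrate in a few lines. The following is a minimal sketch of the general technique, not this repository's implementation: it assumes two grayscale images of identical size stored as nested lists of floats, and the names `sample_lam` and `mixup_detection` are hypothetical. The blended image keeps the boxes of both source images, which is the usual convention for detection.

```python
import random


def sample_lam(alpha=8.0, rng=random):
    """Draw a blending weight from Beta(alpha, alpha).

    The value of alpha is an illustrative assumption, not a setting
    taken from this repository.
    """
    return rng.betavariate(alpha, alpha)


def mixup_detection(img_a, boxes_a, img_b, boxes_b, lam):
    """Blend two same-sized images and keep the boxes of both.

    img_a, img_b: nested lists (rows of pixel values) with the same shape.
    boxes_a, boxes_b: lists of box annotations in any format.
    lam: weight of img_a; img_b receives weight (1 - lam).
    """
    same_rows = len(img_a) == len(img_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(img_a, img_b))
    if not (same_rows and same_cols):
        raise ValueError("images must have the same shape")

    # Pixel-wise convex combination of the two images.
    mixed = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    # Objects from both images remain visible, so both label sets are kept.
    return mixed, list(boxes_a) + list(boxes_b)
```

In practice `lam` would come from `sample_lam()` once per training sample, and the box lists would hold oriented boxes in whatever format the dataset pipeline uses.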

Installation

Please refer to install.md for installation and dataset preparation.

Getting Started

This repo is based on mmdetection. Please see GetStart.md for the basic usage.

Results and Models

The results on the DOTA test-dev set are shown in the table below (download passwords: aabb/swin/ABCD). For more detailed results, please see the paper.

Model Backbone MS DataAug DOTAv1 mAP DOTAv2 mAP Download
OrientedReppoints R-50 - - 75.68 - baidu(aabb)
OrientedReppoints R-101 - 76.21 - baidu(aabb)
OrientedReppoints R-101 78.12 - baidu(aabb)
OrientedReppoints SwinT-tiny - - - -

ImageNet-1K and ImageNet-22K Pretrained Models

name pretrain resolution acc@1 acc@5 #params FLOPs FPS 22K model 1K model Need to turn read version
Swin-T ImageNet-1K 224x224 81.2 95.5 28M 4.5G 755 - github/baidu(swin)/config
Swin-S ImageNet-1K 224x224 83.2 96.2 50M 8.7G 437 - github/baidu(swin)/config
Swin-B ImageNet-1K 224x224 83.5 96.5 88M 15.4G 278 - github/baidu(swin)/config
Swin-B ImageNet-1K 384x384 84.5 97.0 88M 47.1G 85 - github/baidu(swin)/test-config
Swin-B ImageNet-22K 224x224 85.2 97.5 88M 15.4G 278 github/baidu(swin) github/baidu(swin)/test-config
Swin-B ImageNet-22K 384x384 86.4 98.0 88M 47.1G 85 github/baidu(swin) github/baidu(swin)/test-config
Swin-L ImageNet-22K 224x224 86.3 97.9 197M 34.5G 141 github/baidu(swin) github/baidu(swin)/test-config
Swin-L ImageNet-22K 384x384 87.3 98.2 197M 103.9G 42 github/baidu(swin) github/baidu(swin)/test-config
ReResNet50 ImageNet-1K 224x224 71.20 90.28 - - - - google/baidu(ABCD)/log -

The mAOE results on the DOTAv1 val set are shown in the table below (password: aabb).

Model Backbone mAOE Download
OrientedReppoints R-50 5.93° baidu(aabb)

Note:

  • Without ground truth for the test subset, the mAOE for orientation evaluation is calculated on the val subset (the original train subset is used for training).
  • The orientation (angle) of an aerial object is defined as shown in the figure; for the details of mAOE, please see the paper. The mAOE code is in mAOE_evaluation.py.

[Figure: definition of the orientation (angle) of an aerial object]
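
To give a rough idea of what an orientation-error metric measures, here is a minimal sketch. It is not the code in mAOE_evaluation.py and may differ from the definition in the paper. It assumes that predictions have already been matched to ground-truth objects, that angles are in degrees, and that box orientation is periodic with 180°. The function names and the per-class averaging are illustrative assumptions.

```python
def orientation_error_deg(pred_deg, gt_deg, period=180.0):
    """Smallest absolute angular difference under the assumed periodicity."""
    diff = abs(pred_deg - gt_deg) % period
    return min(diff, period - diff)


def mean_average_orientation_error(matches_by_class):
    """Average the per-class mean orientation error over all classes.

    matches_by_class: dict mapping a class name to a list of
    (predicted_angle_deg, ground_truth_angle_deg) pairs for matched objects.
    Classes without any matched pair are skipped.
    """
    per_class = []
    for pairs in matches_by_class.values():
        if not pairs:
            continue
        errors = [orientation_error_deg(p, g) for p, g in pairs]
        per_class.append(sum(errors) / len(errors))
    if not per_class:
        raise ValueError("no matched pairs to evaluate")
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    example = {
        "plane": [(10.0, 15.0), (90.0, 80.0)],
        "ship": [(170.0, 10.0)],
    }
    print(mean_average_orientation_error(example))
```

The wrap-around in `orientation_error_deg` matters near the period boundary: under the 180° assumption, a prediction of 170° against a ground truth of 10° is off by 20°, not by 160°.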

Visual results

The visual results show the learned points and the oriented bounding boxes. The visualization code is in show_learning_points_and_boxes.py.

  • Learning points

Learning Points

  • Oriented bounding box

Oriented Box
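
Drawing an oriented bounding box usually requires its four corner points. The sketch below shows that conversion under one common parametrization, (center x, center y, width, height, rotation angle in degrees). This parametrization and the name `obb_to_corners` are assumptions for illustration; the format used by show_learning_points_and_boxes.py may differ.

```python
import math


def obb_to_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated rectangle as (x, y) tuples.

    The box is centred at (cx, cy) and rotated by angle_deg around its
    centre, using the standard 2D rotation matrix.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Corner offsets from the centre before rotation.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]

    # Rotate each offset, then translate it back to the box centre.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]
```

The returned list can be passed to any polygon-drawing routine after converting the coordinates to that routine's expected type.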

Citation

@article{Li2021oriented,
  title={Oriented RepPoints for Aerial Object Detection},
  author={Wentong Li and Jianke Zhu},
  journal={arXiv preprint arXiv:2105.11111},
  year={2021}
}

Acknowledgements

I have used utility functions from other wonderful open-source projects. Special thanks to the authors of:

OrientedRepPoints

Swin-Transformer-Object-Detection

ReDet
