Learning Spatio-Temporal Transformer for Visual Tracking

STARK

The official implementation of the paper "Learning Spatio-Temporal Transformer for Visual Tracking".

[Figure: STARK framework]

Highlights

End-to-End, Post-processing Free

STARK is an end-to-end tracking approach that directly predicts a single accurate bounding box as the tracking result.
It also avoids hyperparameter-sensitive post-processing, which leads to stable performance.

Real-Time Speed

STARK-ST50 and STARK-ST101 run at 40 FPS and 30 FPS, respectively, on a Tesla V100 GPU.

Strong Performance

Tracker       LaSOT (AUC)   GOT-10K (AO)   TrackingNet (AUC)
STARK         67.1          68.8           82.0
TransT        64.9          67.1           81.4
TrDiMP        63.7          67.1           78.4
Siam R-CNN    64.8          64.9           81.2
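
The AUC and AO columns above are overlap-based scores. As a rough illustration of how such scores are commonly computed (this is not the benchmarks' official evaluation code), the sketch below assumes boxes in (x, y, w, h) format, takes AO as the mean IoU over frames, and takes AUC as the mean success rate over 21 evenly spaced IoU thresholds in [0, 1]. The function names are hypothetical, and official toolkits may differ in details such as threshold spacing or the handling of invalid frames.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x, y, w, h) with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def frame_overlaps(pred: Sequence[Box], gt: Sequence[Box]) -> List[float]:
    """Per-frame IoU between predicted and ground-truth boxes."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return [iou(p, g) for p, g in zip(pred, gt)]


def average_overlap(pred: Sequence[Box], gt: Sequence[Box]) -> float:
    """AO-style score: mean IoU over all frames."""
    overlaps = frame_overlaps(pred, gt)
    return sum(overlaps) / len(overlaps)


def success_auc(pred: Sequence[Box], gt: Sequence[Box], num_thresholds: int = 21) -> float:
    """AUC-style score: mean fraction of frames whose IoU exceeds each threshold."""
    overlaps = frame_overlaps(pred, gt)
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)
```

Both functions return values in [0, 1]; scores like those in the table are usually reported as percentages, i.e. these values multiplied by 100.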

Purely PyTorch-based Code

STARK is implemented purely in PyTorch.

Install the environment

Option 1: Use Anaconda

conda create -n stark python=3.6
conda activate stark
bash install.sh

Option 2: Use the Docker file

We provide a complete Docker file here

Data Preparation

Put the tracking datasets in ./data. It should look like:

${STARK_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
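
Before training, it can help to confirm that the folders match the layout above. The following is a minimal sketch of such a check, not part of the STARK codebase: `EXPECTED_LAYOUT` and `missing_dirs` are hypothetical names, and only the directories explicitly listed above are checked (entries hidden behind "..." are ignored).

```python
from pathlib import Path
from typing import Dict, List

# Only directories that appear explicitly in the layout above.
# Entries elided with "..." in the layout are intentionally not checked.
EXPECTED_LAYOUT: Dict[str, List[str]] = {
    "lasot": ["airplane", "basketball", "bear"],
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "images"],
    "trackingnet": ["TRAIN_0", "TRAIN_1", "TRAIN_11", "TEST"],
}


def missing_dirs(data_root: str) -> List[str]:
    """Return the expected directories (relative to data_root) that do not exist."""
    root = Path(data_root)
    missing: List[str] = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        if not (root / dataset).is_dir():
            # Report the dataset once instead of listing every sub-directory.
            missing.append(dataset)
            continue
        for sub in subdirs:
            if not (root / dataset / sub).is_dir():
                missing.append(f"{dataset}/{sub}")
    return missing


if __name__ == "__main__":
    problems = missing_dirs("./data")
    print("layout OK" if not problems else "missing: " + ", ".join(problems))
```

If only some of the datasets are needed, the corresponding keys can be dropped from `EXPECTED_LAYOUT`.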

Run the following command to set the paths for this project:

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also change the paths by editing these two files:

lib/train/admin/local.py  # paths used for training
lib/test/evaluation/local.py  # paths used for testing

Train STARK

Training with multiple GPUs using DDP

# STARK-S50
python tracking/train.py --script stark_s --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-S50
# STARK-ST50
python tracking/train.py --script stark_st1 --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST50 Stage1
python tracking/train.py --script stark_st2 --config baseline --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline  # STARK-ST50 Stage2
# STARK-ST101
python tracking/train.py --script stark_st1 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST101 Stage1
python tracking/train.py --script stark_st2 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline_R101  # STARK-ST101 Stage2
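
The STARK-ST variants are trained in two stages, with Stage 2 pointing back at the Stage 1 script and config via --script_prv and --config_prv. The sketch below shows one way to build both commands from a single config name so the two stay consistent. It uses only the flags shown above; the helper name `stark_st_commands` is hypothetical, and the script prints the commands rather than running them.

```python
from typing import List


def stark_st_commands(config: str = "baseline", nproc: int = 8, save_dir: str = ".") -> List[List[str]]:
    """Build the Stage 1 and Stage 2 training commands for a STARK-ST config."""
    common = [
        "--config", config,
        "--save_dir", save_dir,
        "--mode", "multiple",
        "--nproc_per_node", str(nproc),
    ]
    stage1 = ["python", "tracking/train.py", "--script", "stark_st1", *common]
    # Stage 2 refers back to the Stage 1 script and config.
    stage2 = [
        "python", "tracking/train.py", "--script", "stark_st2", *common,
        "--script_prv", "stark_st1", "--config_prv", config,
    ]
    return [stage1, stage2]


if __name__ == "__main__":
    # Dry run: print the commands for STARK-ST101 in execution order.
    for cmd in stark_st_commands("baseline_R101"):
        print(" ".join(cmd))
```

To actually launch training, each command list could be passed to `subprocess.run(cmd, check=True)` in order, so that Stage 2 starts only if Stage 1 exits successfully.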

(Optional) Debug training with a single GPU

python tracking/train.py --script stark_s --config baseline --save_dir . --mode single

Test and evaluate STARK on benchmarks

  • LaSOT
python tracking/test.py stark_st baseline --dataset lasot --threads 32
python tracking/analysis_results.py # edit the tracker configs and names in this script first
  • GOT10K-test
python tracking/test.py stark_st baseline_got10k_only --dataset got10k_test --threads 32
python lib/test/utils/transform_got10k.py --tracker_name stark_st --cfg_name baseline_got10k_only
  • TrackingNet
python tracking/test.py stark_st baseline --dataset trackingnet --threads 32
python lib/test/utils/transform_trackingnet.py --tracker_name stark_st --cfg_name baseline
  • VOT2020
    Before evaluating "STARK+AR" on VOT2020, please install some extra packages following external/AR/README.md
cd external/vot20/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
  • VOT2020-LT
cd external/vot20_lt/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh

Test FLOPs, Params, and Speed

# Profiling STARK-S50 model
python tracking/profile_model.py --script stark_s --config baseline
# Profiling STARK-ST50 model
python tracking/profile_model.py --script stark_st2 --config baseline
# Profiling STARK-ST101 model
python tracking/profile_model.py --script stark_st2 --config baseline_R101

Model Zoo

The trained models, training logs, and raw tracking results are provided in the model zoo.

Acknowledgments
