Learning Representational Invariances for Data-Efficient Action Recognition

Last update: Nov 22, 2022

Overview

Official PyTorch implementation for Learning Representational Invariances for Data-Efficient Action Recognition. We follow the code structure of MMAction2.

See the project page for more details.

Installation

We use PyTorch-1.6.0 with CUDA-10.2 and Torchvision-0.7.0.

Please refer to install.md for installation.
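
As a quick sanity check after installation, the sketch below compares the installed package versions against the ones listed above. It is only an illustration: the names PINNED, installed_version, matches_pinned and check_environment are hypothetical and not part of this repository, it assumes Python 3.8+ for importlib.metadata, and it does not check the CUDA version.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the Installation section (CUDA is not checked here).
PINNED = {"torch": "1.6.0", "torchvision": "0.7.0"}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pinned(installed, pinned):
    """True if `installed` equals `pinned`, ignoring a local build tag
    such as '+cu101' that some wheels append to the version."""
    return installed is not None and installed.split("+")[0] == pinned


def check_environment(pinned=PINNED):
    """Map each pinned package name to True/False depending on the match."""
    return {
        name: matches_pinned(installed_version(name), wanted)
        for name, wanted in pinned.items()
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        status = "OK" if ok else f"differs from {PINNED[name]} or not installed"
        print(f"{name}: {status}")
```

A mismatch does not necessarily mean the code will fail; it only means the environment differs from the one listed above.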

Data Preparation

First, please download the human detection results and put them in the corresponding folders under data: UCF-101, HMDB-51, Kinetics-100.

Second, please refer to data_preparation.md to prepare the raw frames of UCF-101 and HMDB-51. (Instructions for extracting frames from Kinetics-100 will be available soon.)

(Optional) You can download the pre-extracted ImageNet scores: UCF-101, HMDB-51.

Training

We use 8 RTX 2080 Ti GPUs to run our experiments. If you have fewer GPUs, you will need to adjust your training schedule accordingly. Please refer to here.
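
One common way to adjust a schedule for a different number of GPUs is the linear scaling rule: scale the learning rate in proportion to the total batch size (number of GPUs × videos per GPU). The sketch below assumes that rule applies to your setup; the function name and the numbers in the example are hypothetical and are not taken from the configs in this repository.

```python
def scale_learning_rate(base_lr, base_gpus, base_videos_per_gpu,
                        gpus, videos_per_gpu):
    """Linear scaling rule: keep lr / total_batch_size constant.

    base_* describe the reference setup the config was tuned for;
    gpus / videos_per_gpu describe the setup you actually have.
    """
    base_batch = base_gpus * base_videos_per_gpu
    new_batch = gpus * videos_per_gpu
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("GPU counts and videos per GPU must be positive")
    return base_lr * new_batch / base_batch


if __name__ == "__main__":
    # Hypothetical numbers: a config tuned for 8 GPUs x 4 videos each,
    # run on 4 GPUs x 4 videos each (half the total batch size).
    new_lr = scale_learning_rate(base_lr=0.2, base_gpus=8,
                                 base_videos_per_gpu=4,
                                 gpus=4, videos_per_gpu=4)
    print(f"scaled learning rate: {new_lr:.4f}")
```

The rule only gives a starting point; warm-up length and the number of epochs may still need tuning.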

Supervised learning

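# Port for distributed communication; defaults to 29500 if PORT is not set
PORT=${PORT:-29500}

# Launch one process per GPU (8 here). ${@:3} forwards any extra
# command-line arguments when this is run from a wrapper script, and
# --validate evaluates checkpoints during training.
python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=$PORT \
tools/train.py \
$CONFIG \
--launcher pytorch ${@:3} \
--validate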

You need to replace $CONFIG with the actual config file:

For the supervised baseline, please use config files in configs/recognition/r2plus1d.
For strongly-augmented supervised learning, please use config files in configs/supervised_aug.

Semi-supervised learning

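# Port for distributed communication; defaults to 29500 if PORT is not set
PORT=${PORT:-29500}

# Same launcher as for supervised learning, with tools/train_semi.py
# as the entry script
python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=$PORT \
tools/train_semi.py \
$CONFIG \
--launcher pytorch ${@:3} \
--validate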

You need to replace $CONFIG with the actual config file:

For single-dataset semi-supervised learning, please use config files in configs/semi.
For cross-dataset semi-supervised learning, please use config files in configs/semi_both.
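
The supervised and semi-supervised commands above differ only in the entry script (tools/train.py vs. tools/train_semi.py) and the config file. As a minimal sketch, the hypothetical helper below assembles the same command for a given number of GPUs. build_launch_command and the placeholder config path are assumptions for illustration; only the flags shown in the commands above are used.

```python
import shlex


def build_launch_command(config, num_gpus=8, port=29500,
                         script="tools/train.py", extra_args=()):
    """Return the distributed launch command as a list of arguments.

    script is tools/train.py for supervised learning or
    tools/train_semi.py for semi-supervised learning.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # one process per GPU
        f"--master_port={port}",
        script,
        config,
        "--launcher", "pytorch",
        *extra_args,                     # plays the role of ${@:3}
        "--validate",
    ]


if __name__ == "__main__":
    # "path/to/config.py" is a placeholder, not a file in this repository.
    cmd = build_launch_command("path/to/config.py", num_gpus=4,
                               script="tools/train_semi.py")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run from the repository root, or printed and pasted into a shell as done here.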

Testing

# Multi-GPU testing
./tools/dist_test.sh $CONFIG ${path_to_your_ckpt} ${num_of_gpus} --eval top_k_accuracy

# Single-GPU testing
python tools/test.py $CONFIG ${path_to_your_ckpt} --eval top_k_accuracy

NOTE: Do not run multi-GPU testing while a multi-GPU training job is running.
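
For reference, top-k accuracy counts a prediction as correct when the ground-truth class is among the k highest-scoring classes. The plain-Python sketch below illustrates the metric on made-up scores. It is not MMAction2's implementation, and the function name and data are hypothetical.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k top-scoring classes.

    scores: list of per-class score lists, one list per sample
    labels: list of ground-truth class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up scores for 3 samples over 3 classes.
    scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print("top-1:", top_k_accuracy(scores, labels, k=1))
    print("top-2:", top_k_accuracy(scores, labels, k=2))
```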

Other details

Please see getting_started.md for the basic usage of MMAction2.

Acknowledgement

The code is built upon MMAction2.

Owner

Virginia Tech Vision and Learning Lab
