PyTorch implementation of the R2Plus1D convolution-based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

Last update: Dec 16, 2022

Overview

R2Plus1D-PyTorch

Link to original: paper and code

NOTE: This repository has been archived, although forks and other work that builds on it remain welcome.

Requirements

R2Plus1D-PyTorch has the following requirements:

PyTorch 0.4 and dependencies
OpenCV (tested on 3.4.0.12)
tqdm (for progress bars)

About this repository

This repository consists of four Python files:

module.py - Contains an implementation of the factored R2Plus1D convolution that the entire implementation is built around. It is designed as a replacement for nn.Conv3d in appropriate scenarios.
network.py - Uses module.py to build up the residual network described in the paper
dataset.py - Implements a PyTorch dataset that can load videos with the appropriate labels from a given directory.
trainer.py - A lightly modified version of the training script from the PyTorch tutorials, with saving and restoring capabilities.
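
To make the idea behind module.py concrete, here is a minimal sketch of a factored (2+1)D convolution. It is not the code from this repository: the class name SpatioTemporalConvSketch, its argument names, and the assumption of a square spatial kernel are all hypothetical. The intermediate channel count follows the parameter-matching formula given in the paper.

```python
import math

import torch
import torch.nn as nn


class SpatioTemporalConvSketch(nn.Module):
    """Hypothetical (2+1)D convolution: a 2D spatial convolution
    followed by a 1D temporal convolution."""

    def __init__(self, in_channels, out_channels, kernel_size=(3, 3, 3),
                 stride=(1, 1, 1), padding=(1, 1, 1)):
        super().__init__()
        # Assumption: kernel_size is (time, height, width) with height == width.
        t, d, _ = kernel_size

        # Intermediate width chosen so that the parameter count roughly
        # matches a full t x d x d 3D convolution (formula from the paper).
        mid_channels = math.floor(
            (t * d * d * in_channels * out_channels)
            / (d * d * in_channels + t * out_channels)
        )

        # Spatial step: kernel of size 1 along the time axis.
        self.spatial = nn.Conv3d(
            in_channels, mid_channels,
            kernel_size=(1, d, d),
            stride=(1, stride[1], stride[2]),
            padding=(0, padding[1], padding[2]),
            bias=False,
        )
        self.bn = nn.BatchNorm3d(mid_channels)
        self.relu = nn.ReLU(inplace=True)

        # Temporal step: kernel of size 1 along both spatial axes.
        self.temporal = nn.Conv3d(
            mid_channels, out_channels,
            kernel_size=(t, 1, 1),
            stride=(stride[0], 1, 1),
            padding=(padding[0], 0, 0),
            bias=False,
        )

    def forward(self, x):
        # x has shape (batch, channels, frames, height, width).
        x = self.relu(self.bn(self.spatial(x)))
        return self.temporal(x)
```

With matching kernel size, stride, and padding, a layer like this produces the same output shape as nn.Conv3d, which is what makes it usable as a replacement inside a residual block.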

Training on Kinetics-400/600

This repository does not include a crawler or downloader for the Kinetics-400/600 datasets; however, one can be found here. It is strongly recommended to downsample the videos before training (not on the fly), using a tool such as ffmpeg. If using the crawler, this can be done by adding "-vf", "scale=172:128" to the ffmpeg command list in the download clip function.
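
For videos that are already on disk, the same scaling filter can be applied in a separate preprocessing pass. The sketch below only builds the ffmpeg argument list; the function name build_downsample_command and its parameters are hypothetical and are not part of this repository or the crawler.

```python
def build_downsample_command(src, dst, width=172, height=128):
    """Return an ffmpeg argument list that rescales `src` and writes `dst`.

    Hypothetical helper: the default size mirrors the "scale=172:128"
    filter mentioned above.
    """
    return [
        "ffmpeg",
        "-i", str(src),                    # input video
        "-vf", f"scale={width}:{height}",  # video filter: resize frames
        str(dst),                          # output video
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed.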

Training in general

This repository is designed so that the ResNet can be trained on any video dataset, using the VideoDataloader class from dataset.py. It expects the videos to be arranged as follows: directory -> [train/val] folders -> [class_label] folders (one per class) -> videos (the files themselves).
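
As an illustration of that layout, the following sketch walks such a directory and pairs each video file with an integer class index. It is independent of dataset.py: the function name list_videos and the alphabetical assignment of class indices are assumptions made for this example.

```python
from pathlib import Path


def list_videos(root, split):
    """Collect (video_path, class_index) pairs from root/split/class_label/video.

    Hypothetical helper; class indices are assigned in alphabetical
    order of the class folder names.
    """
    split_dir = Path(root) / split
    class_names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for video in sorted((split_dir / name).iterdir()):
            if video.is_file():
                samples.append((video, class_to_index[name]))
    return samples, class_to_index
```

Calling list_videos(root, "train") and list_videos(root, "val") on the same root gives one sample list per split.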

Forks and fixes of this repo are highly welcome!

Owner

Irhum Shafkat
