STAR

Official implementation of Sparse Transformer-based Action Recognition

Dataset

Download the NTU RGB+D 60 action recognition dataset (2D/3D skeletons) from http://rose1.ntu.edu.sg/datasets/actionRecognition.asp

or use Google Drive:

NTU60 NTU120

Unzip the data into one of the following file structures: $(project_folder)/raw/*.skeleton or $(project_folder)/dataset/raw/*.skeleton (create a "raw" folder under $(project_folder) or $(project_folder)/dataset, then put the raw skeleton files in it).
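As a quick sanity check before generating the dataset, the folder layout described above can be verified with a short script. This is only a sketch and not part of the repository: the helper names `find_raw_dir` and `count_skeleton_files` are hypothetical, and it assumes the raw files carry the `.skeleton` extension.

```python
import os
import os.path as osp


def find_raw_dir(project_folder):
    """Return the first existing 'raw' folder among the two layouts, or None.

    Hypothetical helper: checks <project_folder>/raw first, then
    <project_folder>/dataset/raw.
    """
    candidates = [
        osp.join(project_folder, "raw"),
        osp.join(project_folder, "dataset", "raw"),
    ]
    for path in candidates:
        if osp.isdir(path):
            return path
    return None


def count_skeleton_files(raw_dir):
    """Count files ending in '.skeleton' directly under raw_dir."""
    return sum(1 for name in os.listdir(raw_dir) if name.endswith(".skeleton"))


if __name__ == "__main__":
    raw_dir = find_raw_dir(os.getcwd())
    if raw_dir is None:
        print("No 'raw' folder found under the project folder or its 'dataset' subfolder.")
    else:
        print(f"Found {count_skeleton_files(raw_dir)} skeleton files in {raw_dir}")
```

Run it from the project folder; a count of zero usually means the archive was extracted into a nested subfolder instead of directly into "raw".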

Run the command below to generate the dataset:

python datagen.py

Training

Run git fetch and check out the "distributed" branch, then start distributed training:

python train_dist.py

Configuration

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),  # or dataset_root=osp.join(os.getcwd(), 'dataset')
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=64,
                    out_channels=64,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=128,
                    out_channels=128,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)
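The two configurations above differ only in hid_channels and out_channels (64 versus 128). The sketch below illustrates how `parser.set_defaults` interacts with command-line arguments in Python's `argparse`, so one of these values can be changed without editing the defaults. It is a minimal illustration, not the repository's code: the `add_argument` declarations and the `build_parser` helper are assumptions, and only a few of the options are reproduced.

```python
import argparse
import os
import os.path as osp


def build_parser():
    """Build a small parser mirroring a few of the defaults shown above.

    Hypothetical: the flag declarations below are assumed for illustration;
    the real training script may declare its arguments differently.
    """
    parser = argparse.ArgumentParser(description="STAR-style configuration sketch")
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--hid_channels", type=int)
    parser.add_argument("--out_channels", type=int)
    parser.add_argument("--dataset_root", type=str)
    # set_defaults supplies values used when a flag is absent from the command line.
    parser.set_defaults(
        batch_size=128,
        hid_channels=64,
        out_channels=64,
        dataset_root=osp.join(os.getcwd()),
    )
    return parser


# Explicit command-line values take precedence over set_defaults.
args = build_parser().parse_args(["--hid_channels", "128", "--out_channels", "128"])
print(args.batch_size, args.hid_channels, args.out_channels)
```

Flags that are not passed keep the value given to `set_defaults`, while flags that are passed override it.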

Owner

Chonghan_Lee
