This repo is official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices(CVPRW 2021).

Last update: Jan 05, 2023

Overview

Github Code of "MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices"

Introduction

This repo is official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices(CVPRW 2021).

Dependencies

This code is tested under Ubuntu 16.04, CUDA 11.2 environment with two NVIDIA RTX or V100 GPUs.

Python 3.6.5 version with virtualenv is used for development.

Directory

Root

The ${ROOT} is described as below.

${ROOT}
|-- data
|-- demo
|-- common
|-- main
|-- tool
|-- vis
`-- output

data contains data loading codes and soft links to images and annotations directories.
demo contains demo codes.
common contains kernel codes for 3d multi-person pose estimation system. Also custom backbone is implemented in this repo
main contains high-level codes for training or testing the network.
tool contains data pre-processing codes. You don't have to run this code. I provide pre-processed data below.
vis contains scripts for 3d visualization.
output contains log, trained models, visualized outputs, and test result.

Data

You need to follow directory structure of the data as below.

${POSE_ROOT}
|-- data
|   |-- Human36M
|   |   |-- bbox_root
|   |   |   |-- bbox_root_human36m_output.json
|   |   |-- images
|   |   |-- annotations
|   |-- MPII
|   |   |-- images
|   |   |-- annotations
|   |-- MSCOCO
|   |   |-- bbox_root
|   |   |   |-- bbox_root_coco_output.json
|   |   |-- images
|   |   |   |-- train2017
|   |   |   |-- val2017
|   |   |-- annotations
|   |-- MuCo
|   |   |-- data
|   |   |   |-- augmented_set
|   |   |   |-- unaugmented_set
|   |   |   |-- MuCo-3DHP.json
|   |-- MuPoTS
|   |   |-- bbox_root
|   |   |   |-- bbox_mupots_output.json
|   |   |-- data
|   |   |   |-- MultiPersonTestSet
|   |   |   |-- MuPoTS-3D.json

Download Human3.6M parsed data [data]
Download MPII parsed data [images][annotations]
Download MuCo parsed and composited data [data]
Download MuPoTS parsed data [images][annotations]
All annotation files follow MS COCO format.
If you want to add your own dataset, you have to convert it to MS COCO format.

Output

You need to follow the directory structure of the output folder as below.

${POSE_ROOT}
|-- output
|-- |-- log
|-- |-- model_dump
|-- |-- result
`-- |-- vis

Creating output folder as soft link form is recommended instead of folder form because it would take large storage capacity.
log folder contains training log file.
model_dump folder contains saved checkpoints for each epoch.
result folder contains final estimation files generated in the testing stage.
vis folder contains visualized results.

3D visualization

Run $DB_NAME_img_name.py to get image file names in .txt format.
Place your test result files (preds_2d_kpt_$DB_NAME.mat, preds_3d_kpt_$DB_NAME.mat) in single or multi folder.
Run draw_3Dpose_$DB_NAME.m

Running 3DMPPE_POSENET

Requirements

cd main
pip install -r requirements.txt

Setup Training

In the main/config.py, you can change settings of the model including dataset to use, network backbone, and input size and so on.

Train

In the main folder, run

python train.py --gpu 0-1 --backbone LPSKI

to train the network on the GPU 0,1.

If you want to continue experiment, run

python train.py --gpu 0-1 --backbone LPSKI --continue

--gpu 0,1 can be used instead of --gpu 0-1.

Test

Place trained model at the output/model_dump/.

In the main folder, run

python test.py --gpu 0-1 --test_epoch 20-21 --backbone LPSKI

to test the network on the GPU 0,1 with 20th and 21th epoch trained model. --gpu 0,1 can be used instead of --gpu 0-1. For the backbone you can either choose BACKBONE_DICT = { 'LPRES':LpNetResConcat, 'LPSKI':LpNetSkiConcat, 'LPWO':LpNetWoConcat }

Human3.6M dataset using protocol 1

For the evaluation, you can run test.py or there are evaluation codes in Human36M.

Human3.6M dataset using protocol 2

For the evaluation, you can run test.py or there are evaluation codes in Human36M.

MuPoTS-3D dataset

For the evaluation, run test.py. After that, move data/MuPoTS/mpii_mupots_multiperson_eval.m in data/MuPoTS/data. Also, move the test result files (preds_2d_kpt_mupots.mat and preds_3d_kpt_mupots.mat) in data/MuPoTS/data. Then run mpii_mupots_multiperson_eval.m with your evaluation mode arguments.

TFLite inference

For the inference in mobile devices we also tested in mobile devices which converting PyTorch implementation through onnx and finally serving into TFlite. Official demo app is available in here

Reference

What this repo cames from: Training section and is based on following paper and github

PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019).
Flexible and simple code.
Compatibility for most of the publicly available 2D and 3D, single and multi-person pose estimation datasets including Human3.6M, MPII, MS COCO 2017, MuCo-3DHP and MuPoTS-3D.
Human pose estimation visualization code.

@InProceedings{Moon_2019_ICCV_3DMPPE,
  author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
  title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
  booktitle = {The IEEE Conference on International Conference on Computer Vision (ICCV)},
  year = {2019}
}

This repo is official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices(CVPRW 2021).

Related tags

Overview

Github Code of "MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices"

Introduction

Dependencies

Directory

Root

Data

Output

3D visualization

Running 3DMPPE_POSENET

Requirements

Setup Training

Train

Test

Human3.6M dataset using protocol 1

Human3.6M dataset using protocol 2

MuPoTS-3D dataset

TFLite inference

Reference

Owner

Choi Sang Bum

Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

A model which classifies reviews as positive or negative.

Official Repository of NeurIPS2021 paper: PTR

Event queue (Equeue) dialect is an MLIR Dialect that models concurrent devices in terms of control and structure.

[WACV21] Code for our paper: Samuel, Atzmon and Chechik, "From Generalized zero-shot learning to long-tail with class descriptors"

Raindrop strategy for Irregular time series

Learning Logic Rules for Document-Level Relation Extraction

The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals

Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes (CVPR2021)

Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21

HNN: Human (Hollywood) Neural Network

State of the Art Neural Networks for Deep Learning

Allows including an action inside another action (by preprocessing the Yaml file). This is how composite actions should have worked.

Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"

PyTorch implementation of neural style randomization for data augmentation

Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks.

Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

A collection of semantic image segmentation models implemented in TensorFlow

PyTorch implementation of the TTC algorithm

Audio Visual Emotion Recognition using TDA