Code for Paper: Self-supervised Learning of Motion Capture

Last update: Jul 25, 2022

Self-supervised Learning of Motion Capture

This is the code for the paper: Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki, "Self-supervised Learning of Motion Capture", NIPS 2017 (Spotlight).

Check the project page for more results.

Content

Environment setup and Dataset
Data preprocessing
Pretrained model and small tfrecords
Train model
Citation
License

1. Environment setup and Dataset

Python: We use Python 2.7.13 from Anaconda and TensorFlow 1.1.
SMPL model: We need the rest-pose body template from the SMPL model.

You can download it from here.

SURREAL Dataset: Required if you plan to pretrain or test on the SURREAL dataset.

Please download SURREAL from here.

H36M Dataset: Required if you plan to test on real video with some ground truth (for evaluation).

Please download the H3.6M dataset from here.

2. Data preprocessing

Parse Surreal Dataset into binary files

To speed up reading and writing when building the tfrecords, we first parse the SURREAL dataset into binary files. Open the file

data/preparsed/main_parse_surreal

and change the data path and output path.
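The idea of pre-parsing into flat binary files can be sketched as below. This is only an illustration, not the repository's parser: the helper names `write_floats` and `read_floats` and the record layout (a uint32 count followed by float32 values) are assumptions made for the example.

```python
import struct


def write_floats(path, values):
    """Write values as a little-endian uint32 count followed by float32 data.

    Hypothetical layout, chosen for illustration only.
    """
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(values)))
        f.write(struct.pack("<%df" % len(values), *values))


def read_floats(path):
    """Read back a file written by write_floats."""
    with open(path, "rb") as f:
        (count,) = struct.unpack("<I", f.read(4))
        data = f.read(4 * count)
        return list(struct.unpack("<%df" % count, data))
```

A fixed binary layout like this can be read back with a single `unpack` call, which is typically much faster than re-parsing the original dataset files each time the tfrecords are rebuilt.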

Build up tfrecords

Set the data path to the output path from the previous step in

pack_data/pack_data_bin.py

and run it. You can specify how many examples go into each tfrecords file by changing the value of num_samples. If "is_test" is False, sequences generated from actors 1, 5, 6, 7 and 8 are used as training samples. If "is_test" is True, only sequence "" from actor 9 is used for validation. You can change this split by modifying the "get_file_list" function in tfrecords_utils.py.
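The split and the per-file example count described above can be sketched roughly as follows. This is not the code in tfrecords_utils.py: `select_sequences` and `chunk` are hypothetical helpers, and sequences are assumed to be `(actor_id, sequence_name)` pairs. Because the validation sequence name is not given here, the sketch keeps every actor 9 sequence when `is_test` is True.

```python
TRAIN_ACTORS = (1, 5, 6, 7, 8)  # actors used for training, per the text above
TEST_ACTORS = (9,)              # actor used for validation


def select_sequences(sequences, is_test):
    """Keep the (actor_id, sequence_name) pairs that belong to the chosen split.

    Assumption: the real code narrows the test split to a single sequence;
    this sketch keeps all sequences of the test actor.
    """
    actors = TEST_ACTORS if is_test else TRAIN_ACTORS
    return [seq for seq in sequences if seq[0] in actors]


def chunk(items, num_samples):
    """Group items into lists of at most num_samples, one list per output file."""
    return [items[i:i + num_samples] for i in range(0, len(items), num_samples)]
```

With this layout, changing the split only means editing the two actor tuples, and `num_samples` directly controls how many examples end up in each file.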

3. Pretrained model and small tfrecords

You can download a model pretrained with supervision from here. surreal_quo0.tfrecords is a small training file and surreal2_100_test_quo1.tfrecords is the matching small test file.

Note: To keep this code package self-contained, the 2D flow is computed directly from the 3D ground truth during testing. You should replace this with your own predicted flow and keypoints.
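Deriving 2D flow from 3D ground truth amounts to projecting the same 3D points in two consecutive frames and taking the pixel difference. The sketch below shows this under a simple pinhole camera, which is an assumption made for illustration: the function names, the `focal` and `center` parameters, and the camera model are not taken from the repository.

```python
def project(point, focal, center):
    """Pinhole projection of a camera-space 3D point (x, y, z) to pixel coordinates."""
    x, y, z = point
    z = float(z)  # avoid integer division under Python 2
    return (focal * x / z + center[0], focal * y / z + center[1])


def flow_from_3d(points_t, points_t1, focal, center):
    """2D displacement of each 3D point between frame t and frame t+1.

    points_t and points_t1 must list the same points in the same order.
    """
    flow = []
    for p0, p1 in zip(points_t, points_t1):
        u0, v0 = project(p0, focal, center)
        u1, v1 = project(p1, focal, center)
        flow.append((u1 - u0, v1 - v0))
    return flow
```

For example, a point that moves one unit along x at depth 2 with `focal=100` shifts 50 pixels horizontally. A predicted flow field would replace the output of `flow_from_3d` when ground truth is not available.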

4. Train model

Open pretrained.sh. It contains one command for pretraining with supervision and one command for finetuning on the test data. Comment out the line you do not need, so that only the command you want is run.

Citation

If you use this code, please cite:

@incollection{NIPS2017_7108,
  title = {Self-supervised Learning of Motion Capture},
  author = {Tung, Hsiao-Yu and Tung, Hsiao-Wei and Yumer, Ersin and Fragkiadaki, Katerina},
  booktitle = {Advances in Neural Information Processing Systems 30},
  editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
  pages = {5236--5246},
  year = {2017},
  publisher = {Curran Associates, Inc.},
  url = {http://papers.nips.cc/paper/7108-self-supervised-learning-of-motion-capture.pdf}
}

Owner

Hsiao-Yu Fish Tung
