Progressive Transformers for End-to-End Sign Language Production

Source code for "Progressive Transformers for End-to-End Sign Language Production" (Ben Saunders, Necati Cihan Camgoz, Richard Bowden - ECCV 2020)

Conference video available at https://twitter.com/BenMSaunders/status/1336638886198521857

Usage

Install required packages using the requirements.txt file.

pip install -r requirements.txt

To run, start __main__.py with the arguments "train" and "./Configs/Base.yaml":

python __main__.py train ./Configs/Base.yaml

An example train.log file can be found at "./Configs/train.log" and an example validations file at "./Configs/validations.txt"

The Back Translation model was created from https://github.com/neccam/slt. Back Translation evaluation code is coming soon.

Data

Pre-processed Phoenix14T data can be requested via email at [email protected]. If you wish to create the data yourself, please follow the steps below:

Phoenix14T data can be downloaded from https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX-2014-T/. Skeleton joints can be extracted using OpenPose (https://github.com/CMU-Perceptual-Computing-Lab/openpose) and lifted to 3D using the 2D-to-3D Inverse Kinematics code under 3DposeEstimator at https://github.com/gopeith/SignLanguageProcessing.
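As a rough illustration of the extraction step, the sketch below (not part of this repository) collects OpenPose per-frame JSON output into per-frame 2D keypoint arrays, ready for the 2D-to-3D lifting handled by the 3DposeEstimator code linked above. It assumes OpenPose was run with hand keypoint estimation enabled; the body-plus-hands keypoint selection and the single-signer assumption are illustrative only.

import json
from pathlib import Path

import numpy as np

def load_openpose_frames(json_dir):
    """Read OpenPose per-frame JSON files and return a list of (num_keypoints, 3) arrays of [x, y, confidence]."""
    frames = []
    for path in sorted(Path(json_dir).glob("*_keypoints.json")):
        with open(path) as f:
            data = json.load(f)
        if not data["people"]:
            frames.append(None)  # no person detected in this frame
            continue
        person = data["people"][0]  # assume a single signer per video
        keypoints = np.concatenate([
            np.array(person["pose_keypoints_2d"]).reshape(-1, 3),        # body
            np.array(person["hand_left_keypoints_2d"]).reshape(-1, 3),   # left hand
            np.array(person["hand_right_keypoints_2d"]).reshape(-1, 3),  # right hand
        ])
        frames.append(keypoints)
    return frames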

Prepare the Phoenix14T (or other sign language dataset) data as plain-text files with .skel, .gloss, .txt and .files extensions. The data format should be parallel .txt files for "src", "trg" and "files", with each line representing a new sequence:

  • The "src" file contains source sentences, with each line representing new sentence.

  • The "trg" file contains skeleton data of each frame, with a space separating frames. The joints should be divided by 3 to match the scaling I used. Each frame contains 150 joint values and a subsequent counter value, all separated by a space. Each sequence should be separated with a new line. If your data contains 150 joints per frame, please ensure that trg_size is set to 150 in the config file.

  • The "files" file should contain the name of each sequence on a new line.

Examples can be found in /Data/tmp. The data path must be specified in the config file.
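The parallel-file layout described above can be produced with a few lines of Python. Below is a minimal sketch (not part of this repository): the output file names, the 0-to-1 counter definition and the number formatting are illustrative assumptions, while the divide-by-3 scaling and the 150-joint-values-plus-counter frame layout follow the description above.

from pathlib import Path

JOINT_VALUES_PER_FRAME = 150   # must match trg_size in the config file
SCALE = 3.0                    # joint values are divided by 3, as noted above

def append_sequence(frames, sentence, name, out_dir="Data/tmp", split="train"):
    """Append one sequence to parallel "trg", "src" and "files" text files, one line per sequence."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    trg_parts = []
    for i, frame in enumerate(frames):
        assert len(frame) == JOINT_VALUES_PER_FRAME, "unexpected number of joint values"
        counter = (i + 1) / len(frames)  # assumed 0-to-1 progress counter appended to each frame
        trg_parts.append(" ".join(f"{v / SCALE:.5f}" for v in frame) + f" {counter:.5f}")

    with open(out / f"{split}.skel", "a") as f:   # "trg": skeleton frames
        f.write(" ".join(trg_parts) + "\n")
    with open(out / f"{split}.txt", "a") as f:    # "src": spoken-language sentence
        f.write(sentence.strip() + "\n")
    with open(out / f"{split}.files", "a") as f:  # "files": sequence name
        f.write(name + "\n")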

Pre-Trained Model

A pre-trained Progressive Transformer checkpoint can be downloaded from https://www.dropbox.com/s/l4xmnybp7luz0l3/PreTrained_PTSLP_Model.ckpt?dl=0.

This model was built with num_layers: 2, num_heads: 4 and embedding_dim: 512, as outlined in ./Configs/Base.yaml. It has been pre-trained on the full PHOENIX14T dataset with the data format described above. The corresponding train.log and validations.txt files can be found in ./Configs.

To initialise a model from this checkpoint, pass the --ckpt ./PreTrained_PTSLP_Model.ckpt argument to either train or test modes. Additionally, to initialise the correct src_embed size, the config argument src_vocab: "./Configs/src_vocab.txt" must be set to the location of the src_vocab.txt, found under ./Configs. Please open an issue if this checkpoint cannot be downloaded or loaded.
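For example, assuming the checkpoint has been downloaded to the repository root and src_vocab has been set in Base.yaml as described above, generation from the pre-trained model would be launched with a command of this form:

python __main__.py test ./Configs/Base.yaml --ckpt ./PreTrained_PTSLP_Model.ckpt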

Reference

If you use this code in your research, please cite the following papers:

@inproceedings{saunders2020progressive,
	title     = {{Progressive Transformers for End-to-End Sign Language Production}},
	author    = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
	year      = {2020}}

@inproceedings{saunders2020adversarial,
	title     = {{Adversarial Training for Multi-Channel Sign Language Production}},
	author    = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
	year      = {2020}}

@article{saunders2021continuous,
	title   = {{Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks}},
	author  = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	journal = {International Journal of Computer Vision (IJCV)},
	year    = {2021}}

Acknowledgements

This work received funding from the SNSF Sinergia project 'SMILE' (CRSII2 160811), the European Union's Horizon 2020 research and innovation programme under grant agreement no. 762021 'Content4All' and the EPSRC project 'ExTOL' (EP/R03298X/1). This work reflects only the authors' view and the Commission is not responsible for any use that may be made of the information it contains. We would also like to thank NVIDIA Corporation for their GPU grant.
