Face Transformer for Recognition

Last update: Nov 30, 2022

Overview

Face-Transformer

This is the code for Face Transformer for Recognition (https://arxiv.org/abs/2103.14803v2).

Recently there has been great interest in Transformers, not only in NLP but also in computer vision. We wonder whether Transformers can be used in face recognition and whether they are better than CNNs. Therefore, we investigate the performance of Transformer models in face recognition. The models are trained on the large-scale face recognition database MS-Celeb-1M and evaluated on several mainstream benchmarks, including the LFW, SLLFW, CALFW, CPLFW, TALFW, CFP-FP, AGEDB and IJB-C databases. We demonstrate that Transformer models achieve performance comparable to CNNs with a similar number of parameters and MACs.

Usage Instructions

1. Preparation

The code is mainly adapted from Vision Transformer and DeiT. In addition to PyTorch and torchvision, install vit_pytorch by Phil Wang and the package timm==0.3.2 by Ross Wightman. We sincerely appreciate their contributions.

pip install vit-pytorch

pip install timm==0.3.2

Copy the files in the folder "copy-to-vit_pytorch-path" to the vit-pytorch installation path.

.
├── __init__.py
├── vit_face.py
└── vits_face.py
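
The copy step above can be scripted. The following is a minimal sketch, assuming that vit-pytorch was installed with pip and that the "vit-pytorch path" means the installed `vit_pytorch` package directory. The helper names `find_package_dir` and `copy_face_files` are hypothetical and not part of this repository.

```python
import importlib.util
import shutil
from pathlib import Path

# File names taken from the listing above.
FACE_FILES = ("__init__.py", "vit_face.py", "vits_face.py")


def find_package_dir(name="vit_pytorch"):
    """Return the directory of an installed package, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def copy_face_files(src_dir, dest_dir, files=FACE_FILES):
    """Copy the listed files from src_dir into dest_dir and return the copied paths."""
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    copied = []
    for fname in files:
        src = src_dir / fname
        if not src.is_file():
            raise FileNotFoundError(src)
        # copy2 overwrites an existing file of the same name (e.g. __init__.py).
        copied.append(Path(shutil.copy2(src, dest_dir / fname)))
    return copied
```

From the repository root, `copy_face_files("copy-to-vit_pytorch-path", find_package_dir())` would perform the copy once `find_package_dir()` returns a path. Note that this replaces the package's own `__init__.py`, so keeping a backup of the original file may be worthwhile.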

2. Databases

Download the training database, MS-Celeb-1M (version ms1m-retinaface), and put it in the folder 'Data'.

Download the testing databases listed below and put them in the folder 'eval'.

LFW: Baidu Netdisk(password: dfj0), Google Drive
SLLFW: Baidu Netdisk(password: l1z6), Google Drive
CALFW: Baidu Netdisk(password: vvqe), Google Drive
CPLFW: Baidu Netdisk(password: jyp9), Google Drive
TALFW: Baidu Netdisk(password: izrg), Google Drive
CFP_FP: Baidu Netdisk(password: 4fem), Google Drive (refer to Insightface)
AGEDB: Baidu Netdisk(password: rlqf), Google Drive (refer to Insightface)

3. Train Models

ViT-P8S8

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s1 --warmup-epochs 1 --lr 3e-4 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s2 --warmup-epochs 0 --lr 1e-4 -r path_to_model 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s3 --warmup-epochs 0 --lr 5e-5 -r path_to_model

ViT-P12S8

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s1 --warmup-epochs 1 --lr 3e-4 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s2 --warmup-epochs 0 --lr 1e-4 -r path_to_model 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s3 --warmup-epochs 0 --lr 5e-5 -r path_to_model
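
For each model, the three commands above differ only in the output directory suffix, the number of warmup epochs, the learning rate, and whether a checkpoint is resumed with `-r`. The sketch below makes that staged schedule explicit by rebuilding the same argument lists. The names `STAGES` and `build_train_command` are hypothetical helpers, and `path_to_model` remains a placeholder for the checkpoint produced by the previous stage.

```python
# Per-stage settings read off the training commands above.
STAGES = [
    {"suffix": "s1", "warmup_epochs": 1, "lr": "3e-4", "resume": False},
    {"suffix": "s2", "warmup_epochs": 0, "lr": "1e-4", "resume": True},
    {"suffix": "s3", "warmup_epochs": 0, "lr": "5e-5", "resume": True},
]


def build_train_command(model_name, network, stage, resume_path="path_to_model"):
    """Return the train.py argument list for one stage of one model."""
    cmd = [
        "python3", "-u", "train.py",
        "-b", "480",
        "-w", "0,1,2,3",
        "-d", "retina",
        "-n", network,
        "-head", "CosFace",
        "--outdir", f"./results/{model_name}_ms1m_cosface_{stage['suffix']}",
        "--warmup-epochs", str(stage["warmup_epochs"]),
        "--lr", stage["lr"],
    ]
    if stage["resume"]:
        # Stages 2 and 3 continue from a checkpoint of the previous stage.
        cmd += ["-r", resume_path]
    return cmd


if __name__ == "__main__":
    for model_name, network in (("ViT-P8S8", "VIT"), ("ViT-P12S8", "VITs")):
        for stage in STAGES:
            print(" ".join(build_train_command(model_name, network, stage)))
```

The printed lines match the commands above without the `CUDA_VISIBLE_DEVICES='0,1,2,3'` prefix, which would still need to be set in the environment before launching.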

4. Pretrained Models and Test Models (on LFW, SLLFW, CALFW, CPLFW, TALFW, CFP_FP, AGEDB)

You can download the following pretrained models:

ViT-P8S8: Baidu Netdisk(password: spkf), Google Drive
ViT-P12S8: Baidu Netdisk(password: 7caa), Google Drive

You can test the models as follows:

python test.py --model ./results/ViT-P12S8_ms1m_cosface/Backbone_VITs_Epoch_2_Batch_12000_Time_2021-03-17-04-05_checkpoint.pth --network VIT 

python test.py --model ./results/ViT-P12S8_ms1m_cosface/Backbone_VITs_Epoch_2_Batch_12000_Time_2021-03-17-04-05_checkpoint.pth --network VITs

Owner

Zhong Yaoyao
