Pytorch implementation of few-shot semantic image synthesis

Last update: Sep 26, 2022

Related tags

Overview

Few-shot Semantic Image Synthesis Using StyleGAN Prior

Our method can synthesize photorealistic images from dense or sparse semantic annotations using a few training pairs and a pre-trained StyleGAN.

Prerequisites

Python3
PyTorch

Preparation

Download and decompress the file containing StyleGAN pre-trained models and put the "pretrained_models" directory in the parent directory.

Inference with our pre-trained models

Download and decompress the file containing our pretrained encoders and put the "results" directory in the parent directory.
For example, our results for celebaMaskHQ in a one-shot setting can be generated as follows:

python scripts/inference.py --exp_dir=results/celebaMaskHQ_oneshot --checkpoint_path=results/celebaMaskHQ_oneshot/checkpoints/iteration_100000.pt --data_path=./data/CelebAMask-HQ/test/labels/ --couple_outputs --latent_mask=8,9,10,11,12,13,14,15,16,17

Inference results are generated in results/celebaMaskHQ_oneshot. If you use other datasets, please specify --exp_dir, --checkpoint_path, and --data_path appropriately.

Training

For each dataset, you can train an encoder as follows:

CelebAMask

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_seg_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 19 --input_nc 19

CelebALandmark

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_landmark_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 71 --input_nc 71 --sparse_labeling

Intermediate training outputs with the StyleGAN pre-trained with the CelebA-HQ dataset. It can be seen that the layouts of the bottom-row images reconstructed from the middle-row pseudo semantic masks gradually become close to those of the top-row StyleGAN samples as the training iterations increase.

LSUN church

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsunchurch_seg_to_img --stylegan_weights pretrained_models/stylegan2-church-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 151 --input_nc 151

LSUN car

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncar_seg_to_img --stylegan_weights pretrained_models/stylegan2-car-config-f.pt --style_num 16 --start_from_latent_avg --label_nc 5 --input_nc 5

LSUN cat

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncat_scribble_to_img --stylegan_weights pretrained_models/stylegan2-cat-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 9 --input_nc 9 --sparse_labeling

Ukiyo-e

python scripts/train.py --exp_dir=[result_dir] --dataset_type=ukiyo-e_scribble_to_img --stylegan_weights pretrained_models/ukiyoe-256-slim-diffAug-002789.pt --style_num 14 --channel_multiplier 1 --start_from_latent_avg --label_nc 8 --input_nc 8 --sparse_labeling

Anime

python scripts/train.py --exp_dir=[result_dir] --dataset_type=anime_cross_to_img --stylegan_weights pretrained_models/2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pt --style_num 16 --start_from_latent_avg --label_nc 2 --input_nc 2 --sparse_labeling

Using StyleGAN samples as few-shot training data

Run the following script:

python scripts/generate_stylegan_samples.py --exp_dir=[result_dir] --stylegan_weights ./pretrained_models/stylegan2-ffhq-config-f.pt --style_num 18 --channel_multiplier 2

Then a StyleGAN image (*.png) and a corresponding latent code (*.pt) are obtained in [result_dir]/data/images and [result_dir]/checkpoints.

Manually annotate the generated image in [result_dir]/data/images and save the annotated mask in [result_dir]/data/labels.
Edit ./config/data_configs.py and ./config/paths_config.py appropriately to use the annotated pairs as a training set.
Run a training command above with appropriate options.

Citation

Please cite our paper if you find the code useful:

@article{endo2021fewshotsmis,
  title = {Few-shot Semantic Image Synthesis Using StyleGAN Prior},
  author = {Yuki Endo and Yoshihiro Kanamori},
  journal   = {CoRR},
  volume    = {abs/2103.14877},
  year      = {2021}
}

Acknowledgements

This code heavily borrows from the pixel2style2pixel repository.

Pytorch implementation of few-shot semantic image synthesis

Related tags

Overview

Few-shot Semantic Image Synthesis Using StyleGAN Prior

Prerequisites

Preparation

Inference with our pre-trained models

Training

Using StyleGAN samples as few-shot training data

Citation

Acknowledgements

Owner

Implementation of Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.

Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ICCV2021 Expert-Goal Trajectory Prediction

PyTorch implementation of Neural Dual Contouring.

This is a collection of our NAS and Vision Transformer work.

A Python module for the generation and training of an entry-level feedforward neural network.

Code repository for the work "Multi-Domain Incremental Learning for Semantic Segmentation", accepted at WACV 2022

Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

NBEATSx: Neural basis expansion analysis with exogenous variables

Codes for our paper "SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge" (EMNLP 2020)

[TIP2020] Adaptive Graph Representation Learning for Video Person Re-identification

Official implementation of "Implicit Neural Representations with Periodic Activation Functions"

Spline is a tool that is capable of running locally as well as part of well known pipelines like Jenkins (Jenkinsfile), Travis CI (.travis.yml) or similar ones.

Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

FreeSOLO for unsupervised instance segmentation, CVPR 2022

Genetic feature selection module for scikit-learn

Pytorch ImageNet1k Loader with Bounding Boxes.