This repository will contain the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

Last update: Dec 27, 2022

Related tags

Text Data & NLP giraffe

Overview

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Project Page | Paper | Supplementary | Video | Slides | Blog | Talk

If you find our code or paper useful, please cite as

@inproceedings{GIRAFFE,
    title = {GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields},
    author = {Niemeyer, Michael and Geiger, Andreas},
    booktitle = {Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021}
}

TL; DR - Quick Start

First you have to make sure that you have all dependencies in place. The simplest way to do so, is to use anaconda.

You can create an anaconda environment called giraffe using

conda env create -f environment.yml
conda activate giraffe

You can now test our code on the provided pre-trained models. For example, simply run

python render.py configs/256res/cars_256_pretrained.yaml

This script should create a model output folder out/cars256_pretrained. The animations are then saved to the respective subfolders in out/cars256_pretrained/rendering.

Usage

Datasets

To train a model from scratch or to use our ground truth activations for evaluation, you have to download the respective dataset.

For this, please run

bash scripts/download_data.sh

and following the instructions. This script should download and unpack the data automatically into the data/ folder.

Controllable Image Synthesis

To render images of a trained model, run

python render.py CONFIG.yaml

where you replace CONFIG.yaml with the correct config file. The easiest way is to use a pre-trained model. You can do this by using one of the config files which are indicated with *_pretrained.yaml.

For example, for our model trained on Cars at 256x256 pixels, run

python render.py configs/256res/cars_256_pretrained.yaml

or for celebA-HQ at 256x256 pixels, run

python render.py configs/256res/celebahq_256_pretrained.yaml

Our script will automatically download the model checkpoints and render images. You can find the outputs in the out/*_pretrained folders.

Please note that the config files *_pretrained.yaml are only for evaluation or rendering, not for training new models: when these configs are used for training, the model will be trained from scratch, but during inference our code will still use the pre-trained model.

FID Evaluation

For evaluation of the models, we provide the script eval.py. You can run it using

python eval.py CONFIG.yaml

The script generates 20000 images and calculates the FID score.

Note: For some experiments, the numbers in the paper might slightly differ because we used the evaluation protocol from GRAF to fairly compare against the methods reported in GRAF.

Training

Finally, to train a new network from scratch, run

python train.py CONFIG.yaml

where you replace CONFIG.yaml with the name of the configuration file you want to use.

You can monitor on http://localhost:6006 the training process using tensorboard:

cd OUTPUT_DIR
tensorboard --logdir ./logs

where you replace OUTPUT_DIR with the respective output directory. For available training options, please take a look at configs/default.yaml.

2D-GAN Baseline

For convinience, we have implemented a 2D-GAN baseline which closely follows this GAN_stability repo. For example, you can train a 2D-GAN on CompCars at 64x64 pixels similar to our GIRAFFE method by running

python train.py configs/64res/cars_64_2dgan.yaml

Using Your Own Dataset

If you want to train a model on a new dataset, you first need to generate ground truth activations for the intermediate or final FID calculations. For this, you can use the script in scripts/calc_fid/precalc_fid.py. For example, if you want to generate an FID file for the comprehensive cars dataset at 64x64 pixels, you need to run

python scripts/precalc_fid.py  "data/comprehensive_cars/images/*.jpg" --regex True --gpu 0 --out-file "data/comprehensive_cars/fid_files/comprehensiveCars_64.npz" --img-size 64

or for LSUN churches, you need to run

python scripts/precalc_fid.py path/to/LSUN --class-name scene_categories/church_outdoor_train_lmdb --lsun True --gpu 0 --out-file data/church/fid_files/church_64.npz --img-size 64

Note: We apply the same transformations to the ground truth images for this FID calculation as we do during training. If you want to use your own dataset, you need to adjust the image transformations in the script accordingly. Further, you might need to adjust the object-level and camera transformations to your dataset.

Evaluating Generated Images

We provide the script eval_files.py for evaluating the FID score of your own generated images. For example, if you would like to evaluate your images on CompCars at 64x64 pixels, save them to an npy file and run

python eval_files.py --input-file "path/to/your/images.npy" --gt-file "data/comprehensive_cars/fid_files/comprehensiveCars_64.npz"

Futher Information

More Work on Implicit Representations

If you like the GIRAFFE project, please check out related works on neural representions from our group:

This repository will contain the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

Related tags

Overview

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Project Page | Paper | Supplementary | Video | Slides | Blog | Talk

TL; DR - Quick Start

Usage

Datasets

Controllable Image Synthesis

FID Evaluation

Training

2D-GAN Baseline

Using Your Own Dataset

Evaluating Generated Images

Futher Information

More Work on Implicit Representations

Owner

Flexible interface for high-performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.

simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.

✨Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.

Material for GW4SHM workshop, 16/03/2022.

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

leaking paid token generator that was a shit lmao for 100$ haha

Tool which allow you to detect and translate text.

Wrapper to display a script output or a text file content on the desktop in sway or other wlroots-based compositors

sangha, pronounced "suhng-guh", is a social networking, booking platform where students and teachers can share their practice.

Train and use generative text models in a few lines of code.

Code for ACL 2021 main conference paper "Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances".

Code for the paper "Flexible Generation of Natural Language Deductions"

Full Spectrum Bioinformatics - a free online text designed to introduce key topics in Bioinformatics using the Python

Easy, fast, effective, and automatic g-code compression!

Reading Wikipedia to Answer Open-Domain Questions

Generate product descriptions, blogs, ads and more using GPT architecture with a single request to TextCortex API a.k.a Hemingwai

MEDIALpy: MEDIcal Abbreviations Lookup in Python

Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers", by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo