A TensorFlow implementation of BicycleGAN.

Last update: Dec 02, 2022

Overview

BicycleGAN implementation in TensorFlow

As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometimes delay) research in the AI community by promoting open-source projects. To this end, we implement state-of-the-art research papers and publicly share them with concise reports. Please visit our group's GitHub site for other projects.

This project was implemented by Youngwoon Lee, and the code was reviewed by Yuan-Hong Liao before publication.

Description

This repo is a TensorFlow implementation of BicycleGAN, from the paper Toward Multimodal Image-to-Image Translation, on the Pix2Pix datasets.

The paper presents a framework for image-to-image translation, where the goal is to convert an image from one domain (e.g., sketch) to another domain (e.g., image). While the previous method (pix2pix) cannot generate diverse outputs, this paper proposes a method in which one image (e.g., a sketch of shoes) can be transformed into a set of images (e.g., shoes with different colors and textures).

The proposed method encourages diverse results by generating output images from noise and then reconstructing the noise from the output images. The framework consists of two cycles: image B -> latent z' -> image B', and noise z -> output B' -> noise z'.
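The data flow of the two cycles can be sketched with toy stand-ins. This is only an illustration: the linear `encode` and `generate` functions, the dimensions, and the flattened "images" are all assumptions made for brevity, whereas the real encoder E and generator G are convolutional networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes and weights (assumptions, not values from this repo).
IMG_DIM, Z_DIM = 16, 4
W_e = rng.normal(size=(Z_DIM, IMG_DIM)) * 0.1            # stand-in encoder weights
W_g = rng.normal(size=(IMG_DIM, IMG_DIM + Z_DIM)) * 0.1  # stand-in generator weights

def encode(b):
    """E: image B -> latent code."""
    return W_e @ b

def generate(a, z):
    """G: (input image A, latent z) -> output image B'."""
    return np.tanh(W_g @ np.concatenate([a, z]))

a = rng.normal(size=IMG_DIM)  # input image A (e.g., a sketch), flattened
b = rng.normal(size=IMG_DIM)  # ground-truth image B, flattened

# Cycle 1: B -> z' -> B' (the output image should reconstruct B)
z_enc = encode(b)
b_rec = generate(a, z_enc)
image_l1 = np.mean(np.abs(b - b_rec))

# Cycle 2: z -> B' -> z' (the encoder should recover the sampled noise)
z = rng.normal(size=Z_DIM)
b_fake = generate(a, z)
z_rec = encode(b_fake)
latent_l1 = np.mean(np.abs(z - z_rec))

print("image L1:", image_l1, "latent L1:", latent_l1)
```

With untrained weights both reconstruction errors are arbitrary; training would drive them down alongside the adversarial losses.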

The first part is the conditional Variational Autoencoder GAN (cVAE-GAN), whose architecture is similar to the pix2pix network with added noise. In cVAE-GAN, a generator G takes an input image A (sketch) and a noise vector z and outputs its counterpart in domain B (image) with variations. However, it was reported that the generator G ends up ignoring the added noise.
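One common way to feed a noise vector into a convolutional generator is to tile it over the spatial grid and append it to the image as extra channels. The sketch below shows that idea in NumPy; whether this repo injects z this way is an assumption, and `concat_noise` and the 8-dimensional z are illustrative names and sizes only.

```python
import numpy as np

def concat_noise(image, z):
    """Tile z over the spatial grid and append it as extra channels.

    image: (H, W, C) array, z: (Z,) array -> (H, W, C + Z) array.
    """
    h, w, _ = image.shape
    z_map = np.broadcast_to(z, (h, w, z.shape[0]))  # same z at every pixel
    return np.concatenate([image, z_map], axis=-1)

sketch = np.zeros((128, 128, 3), dtype=np.float32)  # placeholder input image A
z = np.random.randn(8).astype(np.float32)           # hypothetical latent size
g_input = concat_noise(sketch, z)
print(g_input.shape)  # (128, 128, 11)
```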

The second part, the conditional Latent Regressor GAN (cLR-GAN), forces the generator to use the noise z. An encoder E maps the visual features (color and texture) of a generated image B' to a latent vector z', which should be close to the original noise z. Because |z - z'| can be small only when z is recoverable from B', images generated from different noise vectors must differ. The cLR-GAN therefore alleviates mode collapse. Moreover, a KL-divergence loss KL(p(z) || N(0, I)) encourages the latent vectors to follow a Gaussian distribution, so Gaussian noise can be used as the latent vector at test time.
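The latent reconstruction term and the KL term can be written down in a few lines. The sketch below assumes the encoder outputs the mean and log-variance of a diagonal Gaussian, which is the usual VAE parameterization but is not stated in this description; the function names are hypothetical.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def latent_l1(z, z_rec):
    """Mean absolute error |z - z'| between sampled and recovered noise."""
    return np.mean(np.abs(z - z_rec))

rng = np.random.default_rng(0)
mu, log_var = np.zeros(8), np.zeros(8)        # encoder output equal to the prior
z = sample_latent(mu, log_var, rng)
print(kl_to_standard_normal(mu, log_var))      # 0.0, since the two Gaussians match
print(latent_l1(z, z))                         # 0.0 for a perfect reconstruction
```

The KL term is zero only when the encoded distribution equals N(0, I), which is what lets plain Gaussian noise stand in for encoded latents at test time.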

Finally, the total loss term for BicycleGAN is:
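For reference, the BicycleGAN paper writes the combined objective as below. The notation follows the paper rather than this repo's code: the first two terms belong to the cVAE-GAN cycle, the next two to the cLR-GAN cycle, and the weights \lambda, \lambda_{latent}, and \lambda_{KL} are hyperparameters whose values in this implementation are not stated here.

```latex
G^{*}, E^{*} = \arg\min_{G,E}\,\max_{D}\;
    \mathcal{L}_{GAN}^{VAE}(G, D, E)
  + \lambda\,\mathcal{L}_{1}^{VAE}(G, E)
  + \mathcal{L}_{GAN}(G, D)
  + \lambda_{latent}\,\mathcal{L}_{1}^{latent}(G, E)
  + \lambda_{KL}\,\mathcal{L}_{KL}(E)
```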

Dependencies

Ubuntu 16.04
Python 2.7
TensorFlow 1.1.0
NumPy
SciPy
Pillow
tqdm
h5py

Usage

Execute the following command to download the specified dataset and train a model:

$ python bicycle-gan.py --task edges2shoes --image_size 256

To generate 256x256 images, set --image_size to 256; otherwise, images are resized to and generated at 128x128. Once training ends, the test images are translated to the target domain and the results are saved to a timestamped directory such as ./results/edges2shoes_2017-07-07_07-07-07/.

Available datasets: edges2shoes, edges2handbags, maps, cityscapes, facades

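The options and output path described above can be mimicked with a small stand-alone sketch. It is not the parser in bicycle-gan.py: `build_parser` and `results_dir` are hypothetical helpers, the default of 128 is inferred from the text, and the directory pattern is inferred from the example path.

```python
import argparse
from datetime import datetime

# Dataset names are the ones listed above; everything else is illustrative.
TASKS = ["edges2shoes", "edges2handbags", "maps", "cityscapes", "facades"]

def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical BicycleGAN CLI sketch")
    parser.add_argument("--task", choices=TASKS, required=True)
    # Assumption: 128 is the default, since images are 128x128 unless 256 is requested.
    parser.add_argument("--image_size", type=int, choices=[128, 256], default=128)
    return parser

def results_dir(task, now=None, root="./results"):
    """Build a timestamped output directory name for a task."""
    now = now or datetime.now()
    return "{}/{}_{}".format(root, task, now.strftime("%Y-%m-%d_%H-%M-%S"))

args = build_parser().parse_args(["--task", "edges2shoes", "--image_size", "256"])
print(args.task, args.image_size)
print(results_dir(args.task, datetime(2017, 7, 7, 7, 7, 7)))
# ./results/edges2shoes_2017-07-07_07-07-07
```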
Check the training status on TensorBoard:

$ tensorboard --logdir=./logs

Results

edges2shoes

[Figure: results with linearly sampled noise (left) and randomly sampled noise (right)]
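The two sampling schemes named here can be sketched as follows. This is a guess at what "linearly sampled" and "randomly sampled" mean in practice (interpolating between two latent codes versus drawing from N(0, I)); the endpoints, the latent size of 8, and the function names are assumptions.

```python
import numpy as np

def linear_codes(z_start, z_end, steps):
    """Evenly spaced latent codes on the segment from z_start to z_end."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * z_start + t * z_end

def random_codes(n, z_dim, rng):
    """Independent latent codes drawn from a standard Gaussian."""
    return rng.standard_normal((n, z_dim))

z_dim = 8  # hypothetical latent size
lin = linear_codes(-np.ones(z_dim), np.ones(z_dim), steps=5)
rnd = random_codes(5, z_dim, np.random.default_rng(0))
print(lin[:, 0])  # first coordinate takes the values -1, -0.5, 0, 0.5, 1
print(rnd.shape)  # (5, 8)
```

Feeding each row together with the same input image to the generator would yield one output per latent code, which is how a grid of varied results for a single sketch can be produced.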

day2night

In progress

Owner

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC
