Generative Adversarial Text to Image Synthesis

Overview

Text To Image Synthesis

This is a TensorFlow implementation of text-to-image synthesis. Images are synthesized with the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis. The implementation is built on top of the excellent DCGAN in Tensorflow.
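As a rough illustration of the GAN-CLS idea, the sketch below follows the update described in Algorithm 1 of the paper, not this repository's code. The discriminator scores three pairs: a real image with its matching text (s_r), a real image with mismatched text (s_w), and a generated image with the matching text (s_f). The function name and the scalar inputs are assumptions made for illustration; the training script itself works on batches of TensorFlow tensors.

```python
import math


def gan_cls_objectives(s_r, s_w, s_f):
    """Return (L_D, L_G) for one sample, following Algorithm 1 of the paper.

    s_r: discriminator score for (real image, matching text)
    s_w: discriminator score for (real image, mismatched text)
    s_f: discriminator score for (generated image, matching text)
    All scores are probabilities in the open interval (0, 1).
    """
    # The discriminator ascends L_D: it rewards real, matching pairs and
    # penalises mismatched text and generated images with equal weight.
    l_d = math.log(s_r) + (math.log(1.0 - s_w) + math.log(1.0 - s_f)) / 2.0
    # The generator ascends L_G: it wants its images scored as real and matching.
    l_g = math.log(s_f)
    return l_d, l_g


if __name__ == "__main__":
    l_d, l_g = gan_cls_objectives(s_r=0.9, s_w=0.1, s_f=0.1)
    print("L_D = %.4f, L_G = %.4f" % (l_d, l_g))
```

The mismatched-text term is what distinguishes GAN-CLS from a plain conditional GAN: the discriminator must judge whether the image fits the caption, not only whether it looks real.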

Please star https://github.com/tensorlayer/tensorlayer

Model architecture

Image Source : Generative Adversarial Text-to-Image Synthesis Paper

Requirements

Datasets

  • The model is currently trained on the Oxford-102 flowers dataset. Download the images from here and save them in 102flowers/102flowers/*.jpg. Also download the captions from this link. Extract the archive and copy the text_c10 folder so that the captions are in 102flowers/text_c10/class_*.

N.B. You can download all the required data files manually, or simply run downloads.py and make sure the files end up in the right directories.

python downloads.py
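Before moving on to data loading, it can help to confirm that the files are where the scripts expect them. The following is a minimal sketch that checks only the two locations named above; the function name missing_dataset_paths and the choice of the current directory as the root are assumptions, not part of this repository.

```python
from pathlib import Path


def missing_dataset_paths(root="."):
    """Return the expected dataset locations that are absent under `root`."""
    root = Path(root)
    image_dir = root / "102flowers" / "102flowers"
    caption_dir = root / "102flowers" / "text_c10"

    missing = []
    # Images are expected as 102flowers/102flowers/*.jpg
    if not any(image_dir.glob("*.jpg")):
        missing.append(str(image_dir / "*.jpg"))
    # Captions are expected as 102flowers/text_c10/class_*
    if not any(caption_dir.glob("class_*")):
        missing.append(str(caption_dir / "class_*"))
    return missing


if __name__ == "__main__":
    problems = missing_dataset_paths()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root. An empty result means both the images and the caption folders were found.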

Codes

  • downloads.py: downloads the Oxford-102 flower dataset and caption files (run this first).
  • data_loader.py: loads the data for further processing.
  • train_txt2im.py: trains a text-to-image model.
  • utils.py: helper functions.
  • model.py: model definitions.

References

Results

  • the flower shown has yellow anther red pistil and bright red petals.
  • this flower has petals that are yellow, white and purple and has dark lines
  • the petals on this flower are white with a yellow center
  • this flower has a lot of small round pink petals.
  • this flower is orange in color, and has petals that are ruffled and rounded.
  • the flower has yellow petals and the center of it is brown
  • this flower has petals that are blue and white.
  • these white flowers have petals that start off white in color and end in a white towards the tips.

License

Apache 2.0

Comments
  • ValueError: Object arrays cannot be loaded when allow_pickle=False

    File "train_txt2im.py", line 458, in <module>
        main_train()
    File "train_txt2im.py", line 133, in main_train
        load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn)
    File "/home/siddanath/importantforprojects/text-to-image/utils.py", line 20, in load_and_assign_npz
        params = tl.files.load_npz(name=name)
    File "/home/siddanath/importantforprojects/text-to-image/tensorlayer/files.py", line 600, in load_npz
        return d['params']
    File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 262, in __getitem__
        pickle_kwargs=self.pickle_kwargs)
    File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py", line 722, in read_array
        raise ValueError("Object arrays cannot be loaded when "
    ValueError: Object arrays cannot be loaded when allow_pickle=False

    opened by Siddanth-pai 2
  • Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    I got a problem, how can I solve it?

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnnftxt/rnn/dynamic/rnn/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

    opened by flsd201983 1
  • Next step after download.py

    What is the next step after download.py? I tried python data_loader.py, but it fails with FileNotFoundError: [Errno 2] No such file or directory: '/home/ly/src/lib/text-to-image/102flowers/text_c10'

    opened by arisliang 0
  • ValueError: invalid literal for int() with base 10: 'e' - when making inference

    code -

    sample_sentence = ["a"] * int(sample_size/ni) + ["e"] * int(sample_size/ni) + ["i"] * int(sample_size/ni) + ["o"] * int(sample_size/ni) + ["u"] * int(sample_size/ni)

    for i, sentence in enumerate(sample_sentence):
        print("seed: %s" % sentence)
        sentence = preprocess_caption(sentence)
        sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id]  # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')
    
    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})
    
    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')
    
    opened by Akinleyejoshua 0
  • Why are my test results on the flower dataset very different from result.png?

    import tensorflow as tf
    import tensorlayer as tl
    from tensorlayer.layers import *
    from tensorlayer.prepro import *
    from tensorlayer.cost import *
    import numpy as np
    import scipy
    from scipy.io import loadmat
    import time, os, re, nltk

    from utils import *
    from model import *
    import model
    import pickle

    ###======================== PREPARE DATA ====================================###
    print("Loading data from pickle ...")
    import pickle
    with open("_vocab.pickle", 'rb') as f:
        vocab = pickle.load(f)
    with open("_image_train.pickle", 'rb') as f:
        _, images_train = pickle.load(f)
    with open("_image_test.pickle", 'rb') as f:
        _, images_test = pickle.load(f)
    with open("_n.pickle", 'rb') as f:
        n_captions_train, n_captions_test, n_captions_per_image, n_images_train, n_images_test = pickle.load(f)
    with open("_caption.pickle", 'rb') as f:
        captions_ids_train, captions_ids_test = pickle.load(f)

    # images_train_256 = np.array(images_train_256)
    # images_test_256 = np.array(images_test_256)
    images_train = np.array(images_train)
    images_test = np.array(images_test)

    ni = int(np.ceil(np.sqrt(batch_size)))
    save_dir = "checkpoint"

    t_real_image = tf.placeholder('float32', [batch_size, image_size, image_size, 3], name = 'real_image')

    t_real_caption = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name='real_caption_input')

    t_z = tf.placeholder(tf.float32, [batch_size, z_dim], name='z_noise')
    generator_txt2img = model.generator_txt2img_resnet

    net_rnn = rnn_embed(t_real_caption, is_train=False, reuse=False)
    net_g, _ = generator_txt2img(t_z, net_rnn.outputs, is_train=False, reuse=False, batch_size=batch_size)

    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
    tl.layers.initialize_global_variables(sess)

    net_rnn_name = os.path.join(save_dir, 'net_rnn.npz400.npz')
    net_cnn_name = os.path.join(save_dir, 'net_cnn.npz400.npz')
    net_g_name = os.path.join(save_dir, 'net_g.npz400.npz')
    net_d_name = os.path.join(save_dir, 'net_d.npz400.npz')

    net_rnn_res = tl.files.load_and_assign_npz(sess=sess, name=net_rnn_name, network=net_rnn)

    net_g_res = tl.files.load_and_assign_npz(sess=sess, name=net_g_name, network=net_g)

    sample_size = batch_size
    sample_seed = np.random.normal(loc=0.0, scale=1.0, size=(sample_size, z_dim)).astype(np.float32)

    n = int(sample_size / ni)
    sample_sentence = ["the flower shown has yellow anther red pistil and bright red petals."] * n + \
        ["this flower has petals that are yellow, white and purple and has dark lines"] * n + \
        ["the petals on this flower are white with a yellow center"] * n + \
        ["this flower has a lot of small round pink petals."] * n + \
        ["this flower is orange in color, and has petals that are ruffled and rounded."] * n + \
        ["the flower has yellow petals and the center of it is brown."] * n + \
        ["this flower has petals that are blue and white."] * n + \
        ["these white flowers have petals that start off white in color and end in a white towards the tips."] * n

    for i, sentence in enumerate(sample_sentence):
        print("seed: %s" % sentence)
        sentence = preprocess_caption(sentence)
        sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id]  # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')

    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})

    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')

    opened by keqkeq 0
  • Tensorflow 2.1, Tensorlayer 2.2 update

    Hello,

    are there any plans in the near future to update this repository to the latest Tensorflow and Tensorlayer versions? I've been trying to make the code run with backwards compat (compat.tf1. ...), but I keep running into errors that are a bit too much of a mouthful for me.

    FYI: I've successfully run the DCGAN Tensorlayer implementation with Tensorlayer 2.2 and a self-built Tensorflow 2.1 (with 3.0 compute compatibility) from source in Python 3.7.

    So, an update would be greatly appreciated!

    opened by SadRebel1000 0
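A note on the first comment above (ValueError: Object arrays cannot be loaded when allow_pickle=False): newer NumPy releases refuse to unpickle object arrays unless the caller opts in. The traceback points at load_npz in the bundled tensorlayer/files.py, so a likely workaround is to pass allow_pickle=True to the np.load call there. That is an assumption, since the bundled source is not shown here. The sketch below reproduces the error and the opt-in with an in-memory file; the helper load_params and the toy arrays are hypothetical stand-ins for the saved network weights.

```python
import io

import numpy as np

# Build an in-memory .npz file whose 'params' entry is an object array,
# standing in for a saved list of differently shaped weight arrays.
params = np.empty(2, dtype=object)
params[0] = np.zeros((2, 3), dtype=np.float32)
params[1] = np.ones(4, dtype=np.float32)

buffer = io.BytesIO()
np.savez(buffer, params=params)
raw_bytes = buffer.getvalue()


def load_params(data, allow_pickle):
    """Load the 'params' entry from the bytes of an .npz file."""
    archive = np.load(io.BytesIO(data), allow_pickle=allow_pickle)
    return archive["params"]


try:
    load_params(raw_bytes, allow_pickle=False)
    default_load_failed = False
except ValueError:
    # Object arrays cannot be loaded when allow_pickle=False.
    default_load_failed = True

# Opting in allows the load; only do this for files from a trusted source.
loaded = load_params(raw_bytes, allow_pickle=True)
print(default_load_failed, [p.shape for p in loaded])
```

Because unpickling can execute arbitrary code, allow_pickle=True is only appropriate for checkpoint files you created yourself or obtained from a trusted source.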
Releases(0.2)
Owner
Hao
Assistant Professor @ Peking University