Neural style transfer in PyTorch.

Overview

style-transfer-pytorch

An implementation of neural style transfer (A Neural Algorithm of Artistic Style) in PyTorch, supporting CPUs and Nvidia GPUs. It does automatic multi-scale (coarse-to-fine) stylization to produce high-quality high resolution stylizations, even up to print resolution if the GPUs have sufficient memory. If two GPUs are available, they can both be used to increase the maximum resolution. (Using two GPUs is not faster than using one.)

The algorithm has been modified from that in the literature by:

  • Using the PyTorch pre-trained VGG-19 weights instead of the original VGG-19 weights

  • Changing the padding mode of the first layer of VGG-19 to 'replicate', to reduce edge artifacts

  • When using average or L2 pooling, scaling the result by an empirically derived factor to ensure that the magnitude of the result stays the same on average (Gatys et al. (2015) did not do this)

  • Using an approximation to the MSE loss such that its gradient L1 norm is approximately 1 for content and style losses (in order to approximate the effects of gradient normalization, which produces better visual quality)

  • Normalizing the Gram matrices by the number of elements in each feature map channel rather than by the total number of elements (Johnson et al.) or not normalizing (Gatys et al. (2015))

  • Taking an exponential moving average over the iterates to reduce iterate noise (each new scale is initialized with the previous scale's averaged iterate)

  • Warm-starting the Adam optimizer with scaled-up versions of its first and second moment buffers at the beginning of each new scale, to prevent noise from being added to the iterates at the beginning of each scale

  • Using non-equal weights for the style layers to improve visual quality

  • Stylizing the image at progressively larger scales, each greater by a factor of sqrt(2) (this is improved from the multi-scale scheme given in Gatys et al. (2016))

Example outputs (click for the full-sized version)

Installation

Python 3.6+ is required.

PyTorch is required: follow their installation instructions before proceeding. If you do not have an Nvidia GPU, select None for CUDA. On Linux, you can find out your CUDA version using the nvidia-smi command. PyTorch packages for CUDA versions lower than yours will work, but select the highest you can.

To install style-transfer-pytorch, first clone the repository, then run the command:

pip install -e PATH_TO_REPO

This will install the style_transfer CLI tool. style_transfer uses a pre-trained VGG-19 model (Simonyan et al.), which is 548MB in size, and will download it when first run.

If you have a supported GPU and style_transfer is using the CPU, try using the argument --device cuda:0 to force it to try to use the first CUDA GPU. This should print an informative error message.

Basic usage

style_transfer CONTENT_IMAGE STYLE_IMAGE [STYLE_IMAGE ...] [-o OUTPUT_IMAGE]

Input images will be converted to sRGB when loaded, and output images have the sRGB colorspace. If the output image is a TIFF file, it will be written with 16 bits per channel. Alpha channels in the inputs will be ignored.

style_transfer has many optional arguments: run it with the --help argument to see a full list. Particularly notable ones include:

  • --web enables a simple web interface while the program is running that allows you to watch its progress. It runs on port 8080 by default, but you can change it with --port. If you just want to view the current image and refresh it manually, you can go to /image.

  • --devices manually sets the PyTorch device names. It can be set to cpu to force it to run on the CPU on a machine with a supported GPU, or to e.g. cuda:1 (zero indexed) to select the second CUDA GPU. Two GPUs can be specified, for instance --devices cuda:0 cuda:1. style_transfer will automatically use the first visible CUDA GPU, falling back to the CPU, if it is omitted.

  • -s (--end-scale) sets the maximum image dimension (height and width) of the output. A large image (e.g. 2896x2172) can take around fifteen minutes to generate on an RTX 3090 and will require nearly all of its 24GB of memory. Since both memory usage and runtime increase linearly in the number of pixels (quadratically in the value of the --end-scale parameter), users with less GPU memory or who do not want to wait very long are encouraged to use smaller resolutions. The default is 512.

  • -sw (--style-weights) specifies factors for the weighted average of multiple styles if there is more than one style image specified. These factors are automatically normalized to sum to 1. If omitted, the styles will be blended equally.

  • -cw (--content-weight) sets the degree to which features from the content image are included in the output image. The default is 0.015.

  • -tw (--tv-weight) sets the strength of the smoothness prior. The default is 2.

References

  1. L. Gatys, A. Ecker, M. Bethge (2015), "A Neural Algorithm of Artistic Style"

  2. L. Gatys, A. Ecker, M. Bethge, A. Hertzmann, E. Shechtman (2016), "Controlling Perceptual Factors in Neural Style Transfer"

  3. J. Johnson, A. Alahi, L. Fei-Fei (2016), "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

  4. A. Mahendran, A. Vedaldi (2014), "Understanding Deep Image Representations by Inverting Them"

  5. D. Kingma, J. Ba (2014), "Adam: A Method for Stochastic Optimization"

  6. K. Simonyan, A. Zisserman (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition"

Owner
Katherine Crowson
Katherine Crowson
RRxIO - Robust Radar Visual/Thermal Inertial Odometry: Robust and accurate state estimation even in challenging visual conditions.

RRxIO - Robust Radar Visual/Thermal Inertial Odometry RRxIO offers robust and accurate state estimation even in challenging visual conditions. RRxIO c

Christopher Doer 64 Dec 29, 2022
This repo is to be freely used by ML devs to check the GAN performances without coding from scratch.

GANs for Fun Created because I can! GOAL The goal of this repo is to be freely used by ML devs to check the GAN performances without coding from scrat

Sagnik Roy 13 Jan 26, 2022
Basit bir burç modülü.

Bu modulu burclar hakkinda gundelik bir sekilde bilgi alin diye yaptim ve sizler icin kullanima sunuyorum. Modulun kullanimi asiri basit: Ornek Kullan

Special 17 Jun 08, 2022
Dynamic wallpaper generator.

Wiki • About • Installation About This project is a dynamic wallpaper changer. It waits untill you turn on the music, downloads album cover if it's po

3 Sep 18, 2021
Topic Modelling for Humans

gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ

RARE Technologies 13.8k Jan 03, 2023
A `Neural = Symbolic` framework for sound and complete weighted real-value logic

Logical Neural Networks LNNs are a novel Neuro = symbolic framework designed to seamlessly provide key properties of both neural nets (learning) and s

International Business Machines 138 Dec 19, 2022
Fast, Attemptable Route Planner for Navigation in Known and Unknown Environments

FAR Planner uses a dynamically updated visibility graph for fast replanning. The planner models the environment with polygons and builds a global visi

Fan Yang 346 Dec 30, 2022
This repository contains implementations and illustrative code to accompany DeepMind publications

DeepMind Research This repository contains implementations and illustrative code to accompany DeepMind publications. Along with publishing papers to a

DeepMind 11.3k Dec 31, 2022
The implementation for paper Joint t-SNE for Comparable Projections of Multiple High-Dimensional Datasets.

Joint t-sne This is the implementation for paper Joint t-SNE for Comparable Projections of Multiple High-Dimensional Datasets. abstract: We present Jo

IDEAS Lab 7 Dec 18, 2022
MAg: a simple learning-based patient-level aggregation method for detecting microsatellite instability from whole-slide images

MAg Paper Abstract File structure Dataset prepare Data description How to use MAg? Why not try the MAg_lib! Trained models Experiment and results Some

Calvin Pang 3 Apr 08, 2022
Workshop Materials Delivered on 28/02/2022

intro-to-cnn-p1 Repo for hosting workshop materials delivered on 28/02/2022 Questions you will answer in this workshop Learning Objectives What are co

Beginners Machine Learning 5 Feb 28, 2022
High frequency AI based algorithmic trading module.

Flow Flow is a high frequency algorithmic trading module that uses machine learning to self regulate and self optimize for maximum return. The current

59 Dec 14, 2022
Code for Ditto: Building Digital Twins of Articulated Objects from Interaction

Ditto: Building Digital Twins of Articulated Objects from Interaction Zhenyu Jiang, Cheng-Chun Hsu, Yuke Zhu CVPR 2022, Oral Project | arxiv News 2022

UT Robot Perception and Learning Lab 78 Dec 22, 2022
This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems

Stability Audit This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems, Humantic

Data, Responsibly 4 Oct 27, 2022
A proof of concept ai-powered Recaptcha v2 solver

Recaptcha Fullauto I've decided to open source my old Recaptcha v2 solver. My latest version will be opened sourced this summer. I am hoping this proj

Nate 60 Dec 20, 2022
Deep Learning Specialization by Andrew Ng, deeplearning.ai.

Deep Learning Specialization on Coursera Master Deep Learning, and Break into AI This is my personal projects for the course. The course covers deep l

Engen 1.5k Jan 07, 2023
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Memory Efficient Attention Pytorch Implementation of a memory efficient multi-head attention as proposed in the paper, Self-attention Does Not Need O(

Phil Wang 180 Jan 05, 2023
Fast and Easy Infinite Neural Networks in Python

Neural Tangents ICLR 2020 Video | Paper | Quickstart | Install guide | Reference docs | Release notes Overview Neural Tangents is a high-level neural

Google 1.9k Jan 09, 2023
Official implementation of "Learning Forward Dynamics Model and Informed Trajectory Sampler for Safe Quadruped Navigation" (RSS 2022)

Intro Official implementation of "Learning Forward Dynamics Model and Informed Trajectory Sampler for Safe Quadruped Navigation" Robotics:Science and

Yunho Kim 21 Dec 07, 2022
a curated list of docker-compose files prepared for testing data engineering tools, databases and open source libraries.

data-services A repository for storing various Data Engineering docker-compose files in one place. How to use it ? Set the required settings in .env f

BigData.IR 525 Dec 03, 2022