A TensorFlow implementation of FCN-8s

Last update: Aug 08, 2022

FCN-8s implementation in TensorFlow

Contents

Overview
Examples and demo video
Dependencies
How to use it
Download pre-trained VGG-16

Overview

This is a TensorFlow implementation of the FCN-8s model architecture for semantic image segmentation introduced by Shelhamer et al. in the paper Fully Convolutional Networks for Semantic Segmentation.

This repository contains only the 'all-at-once' version of the FCN-8s model, which converges significantly faster than the version trained in stages. A convolutionalized VGG-16 model trained on ImageNet classification is provided and serves as the encoder of the FCN-8s. Documentation and a tutorial on how to train and evaluate the model and how to use it for prediction are also provided. Several TensorBoard summaries can be recorded out of the box.
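To illustrate what "convolutionalized" means here, the following is a minimal NumPy sketch of the general technique, not code from this repository: the weights of a fully connected layer are reshaped into a k x k convolution kernel so that the layer can slide over feature maps larger than the one it was trained on. The function name `fc_as_conv` and all shapes are assumptions chosen for illustration.

```python
import numpy as np


def fc_as_conv(feature_map, fc_weights, fc_bias, k):
    """Apply a fully connected layer as a 'valid' k x k convolution.

    feature_map: array of shape (H, W, C)
    fc_weights:  array of shape (k * k * C, n_out), as used on a flattened
                 k x k x C input
    fc_bias:     array of shape (n_out,)
    Returns an array of shape (H - k + 1, W - k + 1, n_out).
    """
    h, w, c = feature_map.shape
    n_out = fc_weights.shape[1]
    # Reshaping in C order matches flattening a (k, k, C) patch in C order.
    kernel = fc_weights.reshape(k, k, c, n_out)
    out = np.empty((h - k + 1, w - k + 1, n_out))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = feature_map[i:i + k, j:j + k, :]
            # Contract over the three patch axes (height, width, channels).
            out[i, j] = np.tensordot(patch, kernel, axes=3) + fc_bias
    return out


# Hypothetical small shapes, for illustration only.
rng = np.random.default_rng(0)
k, c, n_out = 3, 4, 5
weights = rng.normal(size=(k * k * c, n_out))
bias = rng.normal(size=n_out)

# On a k x k input, the convolution reproduces the fully connected layer.
small = rng.normal(size=(k, k, c))
dense_output = small.reshape(-1) @ weights + bias
conv_output = fc_as_conv(small, weights, bias, k)

# On a larger input, the same weights yield a spatial map of outputs.
large = rng.normal(size=(6, 7, c))
score_map = fc_as_conv(large, weights, bias, k)
```

On an input of exactly k x k the result equals the dense layer's output; on larger inputs the same weights produce one output vector per spatial position, which is what lets a classification network serve as the encoder of a segmentation model.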

Examples and demo video

Below are some prediction examples from the model trained on the Cityscapes dataset for 13,000 steps at batch size 16, at which point it achieves a mean IoU of 38.2% on the validation dataset. This is of course far from convergence; the purpose of these examples is only to demonstrate that the code works and that the model learns. You can watch the model in action on the Cityscapes demo videos here.
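For readers unfamiliar with the metric, here is a minimal NumPy sketch of how mean IoU (intersection over union, averaged over classes) is commonly computed from a confusion matrix. It is an illustration only: whether this repository's evaluation code computes the metric in exactly this way is an assumption, and the function name `mean_iou` is hypothetical. Handling of ignored or void labels is left out.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union across the classes that occur.

    y_true, y_pred: integer class labels in [0, num_classes), any shape.
    Classes that appear in neither array are excluded from the mean.
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground truth, columns are predictions.
    confusion = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12.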

Dependencies

Python 3.x
TensorFlow 1.x
NumPy
SciPy
OpenCV (for data augmentation)
tqdm
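A quick way to check whether these packages are importable in the current environment is sketched below. It uses only the standard library. The import names in `REQUIRED_MODULES` follow the usual conventions for these packages (OpenCV is imported as `cv2`); the helper name `missing_dependencies` is hypothetical. The check does not verify versions, so it will not tell you whether the installed TensorFlow is a 1.x release.

```python
import importlib.util

# Display name -> import name (OpenCV's Python module is named cv2).
REQUIRED_MODULES = {
    "TensorFlow": "tensorflow",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "OpenCV": "cv2",
    "tqdm": "tqdm",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```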

How to use it

fcn8s_tutorial.ipynb explains how to train and evaluate the model and how to make and visualize predictions.

Download pre-trained VGG-16

You can download the pre-trained, convolutionalized VGG-16 model here.

Owner

Pierluigi Ferrari
