Official Pytorch implementation of 'GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network' (NeurIPS 2020)

Related tags

Deep LearningGOCor
Overview

Official implementation of GOCor

This is the official implementation of our paper :

GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network.
Authors: Prune Truong *, Martin Danelljan *, Luc Van Gool, Radu Timofte

[Paper][Website][Video]

The feature correlation layer serves as a key neural network module in numerous computer vision problems that involve dense correspondences between image pairs. It predicts a correspondence volume by evaluating dense scalar products between feature vectors extracted from pairs of locations in two images. However, this point-to-point feature comparison is insufficient when disambiguating multiple similar regions in an image, severely affecting the performance of the end task. This work proposes GOCor, a fully differentiable dense matching module, acting as a direct replacement to the feature correlation layer. The correspondence volume generated by our module is the result of an internal optimization procedure that explicitly accounts for similar regions in the scene. Moreover, our approach is capable of effectively learning spatial matching priors to resolve further matching ambiguities.

alt text

Also check out our related work GLU-Net and the code here !


In this repo, we only provide code to test on image pairs as well as the pre-trained weights of the networks evaluated in GOCor paper. We will not release the training code. However, since GOCor module is a plug-in replacement for the feature correlation layer, it can be integrated into any architecture and trained using the original training code. We will release general training and evaluation code in a general dense correspondence repo, coming soon here.


For any questions, issues or recommendations, please contact Prune at [email protected]

Citation

If our project is helpful for your research, please consider citing :

@inproceedings{GOCor_Truong_2020,
      title = {{GOCor}: Bringing Globally Optimized Correspondence Volumes into Your Neural Network},
      author    = {Prune Truong 
                   and Martin Danelljan 
                   and Luc Van Gool 
                   and Radu Timofte},
      year = {2020},
      booktitle = {Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information
                   Processing Systems 2020, {NeurIPS} 2020}
}

1. Installation

Note that the models were trained with torch 1.0. Torch versions up to 1.7 were tested for inference but NOT for training, so I cannot guarantee that the models train smoothly for higher torch versions.

  • Create and activate conda environment with Python 3.x
conda create -n GOCor_env python=3.7
conda activate GOCor_env
  • Install all dependencies (except for cupy, see below) by running the following command:
pip install -r requirements.txt

Note: CUDA is required to run the code. Indeed, the correlation layer is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository. The code was developed using Python 3.7 & PyTorch 1.0 & CUDA 9.0, which is why I installed cupy for cuda90. For another CUDA version, change accordingly.

pip install cupy-cuda90==7.8.0 --no-cache-dir 

There are some issues with latest versions of cupy. So for all cuda, install cupy version 7.8.0. For example, on cuda10,

pip install cupy-cuda100==7.8.0 --no-cache-dir 
  • Download an archive with pre-trained models click and extract it to the project folder

2. Models

Pre-trained weights can be downloaded from here. We provide the pre-trained weights of:

  • GLU-Net trained on the static data, these are given for reference, they correspond to the weights 'GLUNet_DPED_CityScape_ADE.pth' that we provided here
  • GLU-Net-GOCor trained on the static data, corresponds to network in the GOCor paper
  • GLU-Net trained on the dynamic data
  • GLU-Net-GOCor trained on the dynamic data, corresponds to network in the GOCor paper
  • PWC-Net finetuned on chairs-things (by us), they are given for reference
  • PWC-Net-GOCor finetuned on chair-things, corresponds to network in the GOCor paper
  • PWC-Net further finetuned on sintel (by us), for reference
  • PWC-Net-GOCor further finetuned on sintel, corresponds to network in the GOCor paper

For reference, you can also use the weights from the original PWC-Net repo, where the networks are trained on chairs-things and further finetuned on sintel. As explained in the paper, for training our PWC-Net-based models, we initialize the network parameters with the pre-trained weights trained on chairs-things.

All networks are created in 'model_selection.py'

3. Test on your own images

You can test the networks on a pair of images using test_models.py and the provided trained model weights. You must first choose the model and pre-trained weights to use. The inputs are the paths to the query and reference images. The images are then passed to the network which outputs the corresponding flow field relating the reference to the query image. The query is then warped according to the estimated flow, and a figure is saved.

For this pair of images (provided to check that the code is working properly) and using GLU-Net-GOCor trained on the dynamic dataset, the output is:

python test_models.py --model GLUNet_GOCor --pre_trained_model dynamic --path_query_image images/eth3d_query.png --path_reference_image images/eth3d_reference.png --write_dir evaluation/

additional optional arguments:
--pre_trained_models_dir (default is pre_trained_models/)

alt text

For baseline GLU-Net, the output is instead:

python test_models.py --model GLUNet --pre_trained_model dynamic --path_query_image images/eth3d_query.png --path_reference_image images/eth3d_reference.png --write_dir evaluation/

alt text

And for PWC-Net-GOCor and baseline PWC-Net:

python test_models.py --model PWCNet_GOCor --pre_trained_model chairs_things --path_query_image images/kitti2015_query.png --path_reference_image images/kitti2015_reference.png --write_dir evaluation/

alt text

python test_models.py --model PWCNet --pre_trained_model chairs_things --path_query_image images/kitti2015_query.png --path_reference_image images/kitti2015_reference.png --write_dir evaluation/

alt text


Possible model choices are : GLUNet, GLUNet_GOCor, PWCNet, PWCNet_GOCor

Possible pre-trained model choices are: static, dynamic, chairs_things, chairs_things_ft_sintel

4. Acknowledgement

We borrow code from public projects, such as pytracking, GLU-Net, DGC-Net, PWC-Net, NC-Net, Flow-Net-Pytorch, RAFT ...

Owner
Prune Truong
PhD Student in Computer Vision Lab of ETH Zurich
Prune Truong
Codebase for testing whether hidden states of neural networks encode discrete structures.

structural-probes Codebase for testing whether hidden states of neural networks encode discrete structures. Based on the paper A Structural Probe for

John Hewitt 349 Dec 17, 2022
Codebase for ECCV18 "The Sound of Pixels"

Sound-of-Pixels Codebase for ECCV18 "The Sound of Pixels". *This repository is under construction, but the core parts are already there. Environment T

Hang Zhao 318 Dec 20, 2022
EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation (CVPR'21)

EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation (CVPR'21) Citation If y

addisonwang 18 Nov 11, 2022
Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.

Hire-Wave-MLP.pytorch Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP Resul

Nevermore 29 Oct 28, 2022
use tensorflow 2.0 to tell a dog and cat from a specified picture

dog_or_cat use tensorflow 2.0 to tell a dog and cat from a specified picture This is one of the classic experiments for the introduction of deep learn

你这个代码我看不懂 1 Oct 22, 2021
Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Learning to Execute (L2E) Official code base for completely reproducing all results reported in I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learnin

3 May 18, 2022
A Keras implementation of CapsNet in the paper: Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules

NOTE This implementation is fork of https://github.com/XifengGuo/CapsNet-Keras , applied to IMDB texts reviews dataset. CapsNet-Keras A Keras implemen

Lauro Moraes 5 Oct 23, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
Open CV - Convert a picture to look like a cartoon sketch in python

Use the video https://www.youtube.com/watch?v=k7cVPGpnels for initial learning.

Sammith S Bharadwaj 3 Jan 29, 2022
Anagram Generator in Python

Anagrams Generator This is a program for computing multiword anagrams. It makes no effort to come up with sentences that make sense; it only finds ana

Day Fundora 5 Nov 17, 2022
A universal memory dumper using Frida

Fridump Fridump (v0.1) is an open source memory dumping tool, primarily aimed to penetration testers and developers. Fridump is using the Frida framew

551 Jan 07, 2023
A project which aims to protect your privacy using inexpensive hardware and easily modifiable software

Protecting your privacy using an ESP32, an IR sensor and a python script This project, which I personally call the "never-gonna-catch-me-in-the-act-ev

8 Oct 10, 2022
PyTorch Implementation of Realtime Multi-Person Pose Estimation project.

PyTorch Realtime Multi-Person Pose Estimation This is a pytorch version of Realtime_Multi-Person_Pose_Estimation, origin code is here Realtime_Multi-P

Dave Fang 157 Nov 12, 2022
This is a Image aid classification software based on python TK library development

This is a Image aid classification software based on python TK library development.

EasonChan 1 Jan 17, 2022
A custom DeepStack model that has been trained detecting ONLY the USPS logo

This repository provides a custom DeepStack model that has been trained detecting ONLY the USPS logo. This was created after I discovered that the Deepstack OpenLogo custom model I was using did not

Stephen Stratoti 9 Dec 27, 2022
SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)

SAT: 2D Semantics Assisted Training for 3D Visual Grounding SAT: 2D Semantics Assisted Training for 3D Visual Grounding by Zhengyuan Yang, Songyang Zh

Zhengyuan Yang 22 Nov 30, 2022
This repo in the implementation of EMNLP'21 paper "SPARQLing Database Queries from Intermediate Question Decompositions" by Irina Saparina, Anton Osokin

SPARQLing Database Queries from Intermediate Question Decompositions This repo is the implementation of the following paper: SPARQLing Database Querie

Yandex Research 20 Dec 19, 2022
Learning To Have An Ear For Face Super-Resolution

Learning To Have An Ear For Face Super-Resolution [Project Page] This repository contains demo code of our CVPR2020 paper. Training and evaluation on

50 Nov 16, 2022
Cockpit is a visual and statistical debugger specifically designed for deep learning.

Cockpit: A Practical Debugging Tool for Training Deep Neural Networks

Felix Dangel 421 Dec 29, 2022
Repository for the AugmentedPCA Python package.

Overview This Python package provides implementations of Augmented Principal Component Analysis (AugmentedPCA) - a family of linear factor models that

Billy Carson 6 Dec 07, 2022