Official implementation of EfficientPose

Overview

EfficientPose

This is the official implementation of EfficientPose. We based our work on the Keras EfficientDet implementation xuannianz/EfficientDet, which in turn builds on the great Keras RetinaNet implementation fizyr/keras-retinanet, the official EfficientDet implementation google/automl, and qubvel/efficientnet.


Installation

  1. Clone this repository
  2. Create a new environment with conda create -n EfficientPose python==3.6
  3. Activate that environment with conda activate EfficientPose
  4. Install TensorFlow 1.15.0 with conda install tensorflow-gpu==1.15.0
  5. Go to the repo dir and install the other dependencies using pip install -r requirements.txt
  6. Compile the Cython modules with python setup.py build_ext --inplace
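
After finishing the steps above, a quick sanity check of the environment can save a failed training run later. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it only checks the Python version and whether the listed modules can be found, without importing them.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 6), required_modules=("tensorflow",)):
    """Return a list of human-readable problems; an empty list means none were found.

    Hypothetical helper: the defaults mirror the versions named in the
    installation steps, but the check itself is an assumption, not repo code.
    """
    problems = []
    found = tuple(sys.version_info[:2])
    if found != tuple(expected_python):
        problems.append(
            "Python {}.{} found, but the installation steps use {}.{}".format(
                found[0], found[1], expected_python[0], expected_python[1]
            )
        )
    for name in required_modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append("module '{}' cannot be found".format(name))
    return problems
```

Calling check_environment() inside the activated EfficientPose environment and printing the returned list should show nothing if Python 3.6 and TensorFlow are both present.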

Dataset and pretrained weights

You can download the Linemod and Occlusion datasets and the pretrained weights from here. Just unzip the Linemod_and_Occlusion.zip file and you can train or evaluate using these datasets as described below.

The datasets were originally downloaded from j96w/DenseFusion and chensong1995/HybridPose and were preprocessed using the generate_masks.py script. The EfficientDet COCO pretrained weights are from xuannianz/EfficientDet.

Training

Linemod

To train a phi = 0 EfficientPose model on object 8 of Linemod (driller) using COCO pretrained weights:

python train.py --phi 0 --weights /path_to_weights/file.h5 linemod /path_to_dataset/Linemod_preprocessed/ --object-id 8

Occlusion

To train a phi = 0 EfficientPose model on Occlusion using COCO pretrained weights:

python train.py --phi 0 --weights /path_to_weights/file.h5 occlusion /path_to_dataset/Linemod_preprocessed/

See train.py for more arguments.
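
If you want to train one model per Linemod object, the commands above can be generated in a loop. The sketch below is only an illustration: build_train_command and train_objects are hypothetical helpers, they use only the arguments shown in this README, and the object IDs you pass must exist in your copy of the dataset.

```python
import subprocess
import sys


def build_train_command(phi, weights, dataset, dataset_path, object_id=None):
    """Assemble the argument list for train.py from the flags shown above."""
    command = [
        sys.executable, "train.py",
        "--phi", str(phi),
        "--weights", weights,
        dataset, dataset_path,
    ]
    # --object-id is only used in the Linemod example, so it stays optional here.
    if object_id is not None:
        command += ["--object-id", str(object_id)]
    return command


def train_objects(object_ids, weights, dataset_path, phi=0, dry_run=True):
    """Print (dry_run=True) or run one Linemod training command per object ID."""
    commands = [
        build_train_command(phi, weights, "linemod", dataset_path, object_id)
        for object_id in object_ids
    ]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the loop as soon as one training run fails.
            subprocess.run(command, check=True)
    return commands
```

With dry_run=True the helper only prints the commands, which makes it easy to verify the paths before starting a long training run from the repo directory.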

Evaluating

Linemod

To evaluate a trained phi = 0 EfficientPose model on object 8 of Linemod (driller) and (optionally) save the predicted images:

python evaluate.py --phi 0 --weights /path_to_weights/file.h5 --validation-image-save-path /where_to_save_predicted_images/ linemod /path_to_dataset/Linemod_preprocessed/ --object-id 8

Occlusion

To evaluate a trained phi = 0 EfficientPose model on Occlusion and (optionally) save the predicted images:

python evaluate.py --phi 0 --weights /path_to_weights/file.h5 --validation-image-save-path /where_to_save_predicted_images/ occlusion /path_to_dataset/Linemod_preprocessed/

If you don't want to save the predicted images, just omit the --validation-image-save-path argument.

Inferencing

We also provide two basic scripts that demonstrate how to use a trained EfficientPose model for inference. With python inference.py you can run EfficientPose on all images in a directory. The required parameters, e.g. the path to the images and the model, can be modified in the inference.py script.

With python inference_webcam.py you can run EfficientPose live with your webcam. Please note that you have to replace the intrinsic camera parameters used in this script (Linemod) with those of your webcam. The Linemod and Occlusion datasets are too small to expect reasonable 6D pose estimation performance in the real world, and many people (like me) probably do not have the exact objects used in Linemod, so you can try displaying a Linemod image on your screen and filming it with your webcam.
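
If you have not calibrated your webcam, a rough intrinsic matrix can be approximated from the image size and the horizontal field of view. This is a minimal sketch under stated assumptions: it uses a pinhole model with square pixels and the principal point at the image center, the function name intrinsics_from_fov is hypothetical, and how inference_webcam.py stores its camera parameters is not shown here. A proper camera calibration will give more accurate values.

```python
import math


def intrinsics_from_fov(width, height, horizontal_fov_deg):
    """Approximate a 3x3 pinhole camera matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].

    Assumptions (not taken from the repository): square pixels (fx == fy)
    and a principal point exactly at the image center.
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    # Half the image width subtends half the field of view at distance fx.
    fx = (width / 2.0) / math.tan(half_fov)
    fy = fx
    cx = width / 2.0
    cy = height / 2.0
    return [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]
```

For example, intrinsics_from_fov(640, 480, 60.0) gives an estimate for a 640x480 stream from a webcam with a 60 degree horizontal field of view; the field of view itself has to come from the camera's data sheet.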

Benchmark

To measure the runtime of EfficientPose on your machine, you can use python benchmark_runtime.py. The required parameters, e.g. the path to the model, can be modified in the benchmark_runtime.py script. Similarly, you can measure the vanilla EfficientDet runtime on your machine with the benchmark_runtime_vanilla_effdet.py script.
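
If you want to time your own inference code in a comparable way, the general pattern is to run a few warm-up iterations first and then average over many timed calls. The following is a generic sketch of that pattern, not the code of benchmark_runtime.py; measure_runtime and its parameters are hypothetical names.

```python
import time


def measure_runtime(fn, warmup=10, iterations=100):
    """Return (mean seconds per call, calls per second) for a zero-argument callable."""
    # Warm-up calls are excluded from timing so that one-off costs
    # (e.g. lazy initialization) do not distort the average.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    mean_seconds = elapsed / iterations
    calls_per_second = iterations / elapsed if elapsed > 0 else float("inf")
    return mean_seconds, calls_per_second
```

To use it, wrap a single model prediction in a function without arguments, for example lambda: model.predict_on_batch(batch), where model and batch are placeholders for your own loaded model and preprocessed input.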

Debugging Dataset and Generator

If you want to modify the generators or build a new custom dataset, it can be very helpful to display the dataset annotations loaded from your generator to make sure everything works as expected. With

python debug.py --phi 0 --annotations linemod /path_to_dataset/Linemod_preprocessed/ --object-id 8

you can display the loaded and augmented image as well as annotations prepared for a phi = 0 model from object 8 of the Linemod dataset. Please see debug.py for more arguments.

Citation

Please cite EfficientPose if you use it in your research:

@misc{bukschat2020efficientpose,
      title={EfficientPose: An efficient, accurate and scalable end-to-end 6D multi object pose estimation approach}, 
      author={Yannick Bukschat and Marcus Vetter},
      year={2020},
      eprint={2011.04307},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

EfficientPose is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license and is freely available for non-commercial use. Please see the LICENSE for further details. If you are interested in commercial use, please contact us at [email protected] or [email protected].
