WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

Last update: Oct 28, 2022

Overview

Paper "Improving image captioning with better use of captions"

@inproceedings{shi2020improving,
  title={Improving Image Captioning with Better Use of Caption},
  author={Shi, Zhan and Zhou, Xu and Qiu, Xipeng and Zhu, Xiaodan},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={7454--7464},
  year={2020}
}

Requirements

python 2.7.15

torch 1.0.1

The exact conda environment is specified in ezs.yml.

For evaluation, you also need to download the coco-captions and cider folders into this directory.
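Before running evaluation, it can help to confirm that both folders are actually in place. The snippet below is a minimal sketch, not part of this repo: the helper name `missing_eval_dirs` is hypothetical, and the folder names are taken literally from the text above, so adjust them if your copies are named differently.

```python
import os

# Folder names as written in this README; treat them as an assumption
# and change them if your downloaded folders are named differently.
REQUIRED_EVAL_DIRS = ("coco-captions", "cider")


def missing_eval_dirs(root=".", required=REQUIRED_EVAL_DIRS):
    """Return the required folder names that are not directories under root."""
    return [name for name in required
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    missing = missing_eval_dirs()
    if missing:
        print("Missing evaluation folders: " + ", ".join(missing))
    else:
        print("All evaluation folders found.")
```

Run it from the repository root. It only checks that the directories exist, not that their contents are complete.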

Data Files and Models

Files: Copy the files from the data directory on Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) into the data directory here. See data/README for more details.

Models: Copy the log directory from Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) into this directory.

Scripts

MLE training:

python train.py --gpus 0 --id experiment-mle

RL training:

python train.py --gpus 0 --id experiment-rl --learning_rate 2e-5 --resume_from experiment-mle --resume_from_best True --self_critical_after 0 --max_epochs 60 --learning_rate_decay_start -1 --scheduled_sampling_start -1 --reduce_on_plateau
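The `--self_critical_after` flag and the acknowledged base repo (self-critical.pytorch) suggest that the RL stage uses self-critical sequence training. The snippet below is a minimal sketch of that general idea in plain Python, not the code in train.py: the function name, the per-caption reward inputs, and the averaging over captions are all assumptions made for illustration.

```python
def self_critical_loss(sample_logprobs, sample_rewards, greedy_rewards):
    """Policy-gradient loss with the greedy caption's reward as the baseline.

    sample_logprobs: per-caption lists of token log-probabilities for the
                     sampled captions.
    sample_rewards:  one reward (e.g. a CIDEr score) per sampled caption.
    greedy_rewards:  reward of the greedy-decoded caption for the same image.
    """
    losses = []
    for logprobs, r_sample, r_greedy in zip(
            sample_logprobs, sample_rewards, greedy_rewards):
        # A positive advantage means the sample beat the greedy caption.
        advantage = r_sample - r_greedy
        # Minimizing this raises the log-probability of samples with
        # positive advantage and lowers it for negative advantage.
        losses.append(-advantage * sum(logprobs))
    return sum(losses) / float(len(losses))


if __name__ == "__main__":
    loss = self_critical_loss(
        sample_logprobs=[[-0.5, -0.5], [-1.0]],
        sample_rewards=[1.0, 0.2],
        greedy_rewards=[0.5, 0.6],
    )
    print(loss)
```

In a real training loop the log-probabilities would be tensors from the model and the rewards would come from the evaluation metric, so only the log-probabilities carry gradients.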

Evaluate your own model or load a trained model:

python eval.py --gpus 0 --resume_from experiment-mle

and

python eval.py --gpus 0 --resume_from experiment-rl
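The commands above form a fixed sequence: MLE training, RL training resumed from the MLE run, then evaluation of both. As a convenience, they can be assembled in one place. The wrapper below is a minimal sketch that is not part of this repo; the function names and the `dry_run` switch are hypothetical, while the flags and values are copied from the commands above.

```python
import subprocess


def train_mle_cmd(gpus="0", exp_id="experiment-mle"):
    return ["python", "train.py", "--gpus", gpus, "--id", exp_id]


def train_rl_cmd(gpus="0", exp_id="experiment-rl", mle_id="experiment-mle"):
    # Resumes from the best MLE checkpoint; values mirror the README command.
    return ["python", "train.py", "--gpus", gpus, "--id", exp_id,
            "--learning_rate", "2e-5",
            "--resume_from", mle_id,
            "--resume_from_best", "True",
            "--self_critical_after", "0",
            "--max_epochs", "60",
            "--learning_rate_decay_start", "-1",
            "--scheduled_sampling_start", "-1",
            "--reduce_on_plateau"]


def eval_cmd(exp_id, gpus="0"):
    return ["python", "eval.py", "--gpus", gpus, "--resume_from", exp_id]


def pipeline(gpus="0"):
    """Return the four commands in the order they should run."""
    return [train_mle_cmd(gpus), train_rl_cmd(gpus),
            eval_cmd("experiment-mle", gpus), eval_cmd("experiment-rl", gpus)]


def run(commands, dry_run=True):
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Stop at the first failing stage instead of continuing.
            subprocess.check_call(cmd)


if __name__ == "__main__":
    run(pipeline(), dry_run=True)
```

With `dry_run=True` the script only prints the commands. Set it to `False` to execute them from the repository root, inside the conda environment described under Requirements.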

Acknowledgement

This code is based on Ruotian Luo's brilliant image captioning repo ruotianluo/self-critical.pytorch. We use the detected bounding boxes, categories, and features provided by Bottom-Up (peteanderson80/bottom-up-attention) and yangxuntu/SGAE. Many thanks for their work!
