An easy-to-use app to visualise attentions of various VQA models.

Last update: Nov 13, 2022

Overview

Ask Me Anything: A tool for visualising Visual Question Answering (AMA)

AMA is an easy-to-use app for visualising the attention of various VQA models. A live demo of the app is also available.

• Models
• Requirements
• Installation
• How to run
• How to use
• Contributing
• Acknowledgements

Models

• MFB - Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao
arXiv

• (Coming soon) MCAN - Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
arXiv

Requirements

Please check the requirements.txt file for the version numbers.

opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1
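
To confirm that an environment matches these pins, a small check along the following lines may help. This is only a sketch and is not part of the repository: the helper names `parse_pins` and `find_mismatches` are hypothetical, and it assumes Python 3.8 or newer so that `importlib.metadata` is available in the standard library.

```python
from importlib import metadata

# Pins copied from the list above.
PINNED = """
opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1
"""


def parse_pins(text):
    """Return {package: version} for every 'name==version' line (hypothetical helper)."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (wanted, installed)}; installed is None if the package is missing."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(parse_pins(PINNED)).items():
        print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

Run inside the activated environment, the script prints one line per package whose installed version differs from the pin, and prints nothing when everything matches.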

Installation

Install Anaconda.
Clone this repository and cd into it:
git clone https://github.com/apugoneappu/ask_me_anything.git && cd ask_me_anything
In a new environment (new_env), install the dependencies:
pip install -r requirements.txt

How to run

From the directory of this repository, run the following:

conda activate new_env
streamlit run main.py
In a browser tab, open the Network URL displayed in your terminal.

Done! 🎉

How to use

Contributing

First of all, thank you for wanting to contribute to this work! I will try to make your job as easy as possible. Detailed instructions are coming soon.

Acknowledgements

This repository has been built by modifying the OpenVQA repository.

I would also like to thank Yash Khandelwal, Nikhil Shah and Chinmay Singh for their support and amazing suggestions!

Huge thanks to Streamlit for making all of this possible, and for Streamlit Sharing, which enables free hosting of this app! ❤️

Owner

Apoorve
