Key information extraction from invoice document with Graph Convolution Network

Last update: Dec 16, 2022

Overview

Key Information Extraction from Scanned Invoices

Related blog post from my Viblo account: https://viblo.asia/p/djeZ1yPGZWz

Models

Background subtraction: U2Net
Image alignment: based on the text-detection output and OpenCV (cv2)
Text detection: CRAFT and an in-house text-detection model
Text recognition: VietOCR and an in-house text-recognition model
KIE (key information extraction): Graph Convolution Network
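
As a rough illustration of the KIE step, the sketch below links detected text boxes to their nearest neighbours and applies one graph-convolution layer over per-box features. It is not the code from this repository: the function names, the k-nearest-neighbour graph, and the symmetric normalization D^-1/2 (A + I) D^-1/2 are assumptions chosen to keep the example small and dependency-free.

```python
import math


def box_center(box):
    """Center (x, y) of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def knn_adjacency(boxes, k=2):
    """Symmetric 0/1 adjacency: each box is linked to its k nearest boxes (by center)."""
    centers = [box_center(b) for b in boxes]
    n = len(boxes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(centers[i], centers[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(adj_norm @ features @ weight)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Three hypothetical text boxes with made-up 2-dimensional features.
boxes = [(0, 0, 10, 5), (12, 0, 30, 5), (0, 20, 10, 25)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0], [-1.0]]  # maps 2 input features to 1 output feature

adj_norm = normalize_adjacency(knn_adjacency(boxes, k=1))
node_embeddings = gcn_layer(adj_norm, features, weight)
```

In a real pipeline the per-box features would come from the recognized text and box geometry, and several such layers would be followed by a per-node classifier that assigns each box a field label.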

Currently, I don't have an invoice-orientation classifier model. You can develop one yourself to rotate the image back when it is turned sideways or upside down.
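
Once such a classifier exists, undoing the rotation is the easy part. The sketch below assumes a hypothetical classifier that reports how many degrees clockwise the image was rotated (0, 90, 180 or 270). It works on a small nested-list grid to stay dependency-free; on real images the same quarter turns can be done with cv2.rotate. The function names are illustrative and not part of this repository.

```python
def rotate_cw(image):
    """Rotate a row-major 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def undo_rotation(image, predicted_angle):
    """Return the upright image, given the clockwise rotation (in degrees) it suffered."""
    if predicted_angle not in (0, 90, 180, 270):
        raise ValueError("predicted_angle must be one of 0, 90, 180, 270")
    # Undoing a clockwise rotation of a degrees equals rotating clockwise by 360 - a.
    turns = ((360 - predicted_angle) % 360) // 90
    for _ in range(turns):
        image = rotate_cw(image)
    return image


# A tiny stand-in for an image: rotate it, then restore it.
upright = [[1, 2], [3, 4]]
sideways = rotate_cw(upright)            # as if scanned rotated 90 degrees clockwise
restored = undo_rotation(sideways, 90)   # 90 is what the hypothetical classifier predicts
```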

Pretrained model

Google Drive

Data

MC-OCR, a Vietnamese receipts dataset: https://aihub.vn/competitions/1
Preprocessed data: Google Drive

Pipeline

TODO

Command

Create a virtual environment using conda or virtualenv:

# with virtualenv
virtualenv -p python3 invoice_env
# activate environment
source invoice_env/bin/activate
# install prerequisite libraries
pip install -r requirements.txt

# 1st command, run API
make serve
# 2nd command, run web-gui with streamlit
make runapp

Then open the local server at 0.0.0.0:7778 (localhost:7778 when browsing from the same machine).
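
If the page does not load, it can help to confirm that something is actually listening on port 7778. The following is a minimal sketch using only the Python standard library. The helper name is_port_open is hypothetical, and 127.0.0.1 is assumed to be the machine running the server.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Port 7778 is the one mentioned above; the host is an assumption.
if is_port_open("127.0.0.1", 7778):
    print("Server is reachable on port 7778")
else:
    print("Nothing is listening on port 7778 yet")
```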

Preview

TODO

Add a data preprocessing script

Reference

MC-OCR dataset: https://aihub.vn/competitions/1
U2Net: https://github.com/xuebinqin/U-2-Net
CRAFT: https://github.com/clovaai/CRAFT-pytorch
VietOCR: https://github.com/pbcquoc/vietocr
Benchmarking GNNs: https://github.com/graphdeeplearning/benchmarking-gnns
PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Owner

Phan Hoang
