Handwritten Text Recognition (HTR) using TensorFlow 2.x

Overview

Handwritten Text Recognition (HTR) system implemented using TensorFlow 2.x and trained on the Bentham/IAM/Rimes/Saint Gall/Washington offline HTR datasets. This Neural Network model recognizes the text contained in the images of segmented texts lines.

Data partitioning (train, validation, test) was performed following the methodology of each dataset. The project implemented the HTRModel abstraction model (inspired by CTCModel) as a way to facilitate the development of HTR systems.

Notes:

  1. All references are commented in the code.
  2. This project doesn't offer post-processing, such as Statistical Language Model.
  3. Check out the presentation in the doc folder.
  4. For more information and demo run step by step, check out the tutorial on Google Colab/Drive.

Datasets supported

a. Bentham

b. IAM

c. Rimes

d. Saint Gall

e. Washington

Requirements

  • Python 3.x
  • OpenCV 4.x
  • editdistance
  • TensorFlow 2.x

Command line arguments

  • --source: dataset/model name (bentham, iam, rimes, saintgall, washington)
  • --arch: network to be used (puigcerver, bluche, flor)
  • --transform: transform dataset to the HDF5 file
  • --cv2: visualize sample from transformed dataset
  • --kaldi_assets: save all assets for use with kaldi
  • --image: predict a single image with the source parameter
  • --train: train model using the source argument
  • --test: evaluate and predict model using the source argument
  • --norm_accentuation: discard accentuation marks in the evaluation
  • --norm_punctuation: discard punctuation marks in the evaluation
  • --epochs: number of epochs
  • --batch_size: number of the size of each batch

Tutorial (Google Colab/Drive)

A Jupyter Notebook is available to demo run, check out the tutorial on Google Colab/Drive.

Sample

Bentham sample with default parameters in the tutorial file.

  1. Preprocessed image (network input)
  2. TE_L: Ground Truth Text (label)
  3. TE_P: Predicted text (network output)

Citation

If this project helped in any way in your research work, feel free to cite the following papers.

HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models (here)

This work aimed to propose a different pipeline for Handwritten Text Recognition (HTR) systems in post-processing, using two steps to correct the output text. The first step aimed to correct the text at the character level (using N-gram model). The second step had the objective of correcting the text at the word level (using a word frequency dictionary). The experiment was validated in the IAM dataset and compared to the best works proposed within this data scenario.

@inproceedings{10.1145/3395027.3419603,
    author      = {Neto, Arthur F. S. and Bezerra, Byron L. D. and Toselli, Alejandro H. and Lima, Estanislau B.},
    title       = {{HTR-Flor++:} A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models},
    booktitle   = {Proceedings of the ACM Symposium on Document Engineering 2020},
    year        = {2020},
    publisher   = {Association for Computing Machinery},
    address     = {New York, NY, USA},
    location    = {Virtual Event, CA, USA},
    series      = {DocEng '20},
    isbn        = {9781450380003},
    url         = {https://doi.org/10.1145/3395027.3419603},
    doi         = {10.1145/3395027.3419603},
}

Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems (here)

This work aimed a deep study within the research field of Natural Language Processing (NLP), and to bring its approaches to the research field of Handwritten Text Recognition (HTR). Thus, for the experiment and validation, we used 5 datasets (Bentham, IAM, RIMES, Saint Gall and Washington), 3 optical models (Bluche, Puigcerver, Flor), and 8 techniques for text correction in post-processing, including approaches statistics and neural networks, such as encoder-decoder models (seq2seq and Transformers).

@article{10.3390/app10217711,
    author  = {Neto, Arthur F. S. and Bezerra, Byron L. D. and Toselli, Alejandro H.},
    title   = {Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems},
    journal = {Applied Sciences},
    pages   = {1-29},
    month   = {10},
    year    = {2020},
    volume  = {10},
    number  = {21},
    url     = {https://doi.org/10.3390/app10217711},
    doi     = {10.3390/app10217711},
}

HDSR-Flor: A Robust End-to-End System to Solve the Handwritten Digit String Recognition Problem in Real Complex Scenarios (here)

This work aimed to propose the optical model for Handwritten Digit String Recognition (HDSR) and compare it with the state-of-the-art models. The International Conference on Frontiers of Handwriting Recognition (ICFHR) 2014 competition on HDSR were used as baselines toevaluate the effectiveness of our proposal, whose metrics, datasets and recognition methods were adopted for fair comparison. Furthermore, we also use a private dataset (Brazilian Bank Check - Courtesy Amount Recognition), and 11 different approaches from the state-of-the-art in HDSR, as well as 2 optical models from the state-of-the-art in Handwritten Text Recognition (HTR).

@article{10.1109/ACCESS.2020.3039003,
    author  = {Neto, Arthur F. S. and Bezerra, Byron L. D. and Lima, Estanislau B. and Toselli, Alejandro H.},
    title   = {{HDSR-Flor:} A Robust End-to-End System to Solve the Handwritten Digit String Recognition Problem in Real Complex Scenarios},
    journal = {IEEE Access},
    pages   = {208543-208553},
    month   = {11},
    year    = {2020},
    volume  = {8},
    isbn    = {2169-3536},
    url     = {https://doi.org/10.1109/ACCESS.2020.3039003},
    doi     = {10.1109/ACCESS.2020.3039003},
}

HTR-Flor: A Deep Learning System for Offline Handwritten Text Recognition (here)

This work aimed to propose the optical model for Handwritten Text Recognition (HTR) and compare it with the state-of-the-art models. The performance comparison was validated in 5 different datasets (Bentham, IAM, RIMES, Saint Gall and Washington). In addition, it was considered one of the best papers in the 33rd SIBGRAPI (2020).

@inproceedings{10.1109/SIBGRAPI51738.2020.00016,
    author      = {Neto, Arthur F. S. and Bezerra, Byron L. D. and Toselli, Alejandro H. and Lima, Estanislau B.},
    title       = {{HTR-Flor:} A Deep Learning System for Offline Handwritten Text Recognition},
    booktitle   = {2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)},
    pages       = {54-61},
    month       = {11},
    year        = {2020},
    location    = {Recife/Porto de Galinhas, PE, Brazil},
    series      = {SIBGRAPI' 33},
    publisher   = {IEEE Computer Society},
    address     = {Los Alamitos, CA, USA},
    url         = {https://doi.org/10.1109/SIBGRAPI51738.2020.00016},
    doi         = {10.1109/SIBGRAPI51738.2020.00016},
}
You might also like...
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra

A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.
A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.

awesome-deep-text-detection-recognition A curated list of awesome deep learning based papers on text detection and recognition. Text Detection Papers

Text recognition (optical character recognition) with deep learning methods.
Text recognition (optical character recognition) with deep learning methods.

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | paper | training and evaluation data | failure cases and cle

text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network
text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

text-detection-ctpn Scene text detection based on ctpn (connectionist text proposal network). It is implemented in tensorflow. The origin paper can be

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.
Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Sign Language Recognition Service This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform s

CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

CUTIE TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor." Xiaohu

Comments
  • Bump tensorflow from 2.9.1 to 2.9.3

    Bump tensorflow from 2.9.1 to 2.9.3

    Bumps tensorflow from 2.9.1 to 2.9.3.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This releases introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump tensorflow from 2.3.0 to 2.3.1

    Bump tensorflow from 2.3.0 to 2.3.1

    Bumps tensorflow from 2.3.0 to 2.3.1.

    Release notes

    Sourced from tensorflow's releases.

    TensorFlow 2.3.1

    Release 2.3.1

    Bug Fixes and Other Changes

    Changelog

    Sourced from tensorflow's changelog.

    Release 2.3.1

    Bug Fixes and Other Changes

    Release 2.2.1

    ... (truncated)

    Commits
    • fcc4b96 Merge pull request #43446 from tensorflow-jenkins/version-numbers-2.3.1-16251
    • 4cf2230 Update version numbers to 2.3.1
    • eee8224 Merge pull request #43441 from tensorflow-jenkins/relnotes-2.3.1-24672
    • 0d41b1d Update RELEASE.md
    • d99bd63 Insert release notes place-fill
    • d71d3ce Merge pull request #43414 from tensorflow/mihaimaruseac-patch-1-1
    • 9c91596 Fix missing import
    • f9f12f6 Merge pull request #43391 from tensorflow/mihaimaruseac-patch-4
    • 3ed271b Solve leftover from merge conflict
    • 9cf3773 Merge pull request #43358 from tensorflow/mm-patch-r2.3
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Releases(v0.0.6)
This can be use to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

Ashutosh Mahapatra 2 Feb 06, 2022
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
Detect the mathematical formula from the given picture and the same formula is extracted and converted into the latex code

Mathematical formulae extractor The goal of this project is to create a learning based system that takes an image of a math formula and returns corres

6 May 22, 2022
Augmenting Anchors by the Detector Itself

Augmenting Anchors by the Detector Itself Introduction It is difficult to determine the scale and aspect ratio of anchors for anchor-based object dete

4 Nov 06, 2022
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

Jaided AI 16.7k Jan 03, 2023
Ackermann Line Follower Robot Simulation.

Ackermann Line Follower Robot This is a simulation of a line follower robot that works with steering control based on Stanley: The Robot That Won the

Lucas Mazzetto 2 Apr 16, 2022
Smart computer vision application

Smart-computer-vision-application Backend : opencv and python Library required:

2 Jan 31, 2022
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)

Detecting Text in Natural Image with Connectionist Text Proposal Network The codes are used for implementing CTPN for scene text detection, described

Tian Zhi 1.3k Dec 22, 2022
Open Source Differentiable Computer Vision Library for PyTorch

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer

kornia 7.6k Jan 04, 2023
Using Opencv ,based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching

Using Opencv ,this project is based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching ,it will just mask that image . This project ,if used in cctv

1 Feb 13, 2022
Contextual speed detection for python

Speed Prediction using Optical Flow and 2D CNN About the challenge: Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the s

Mahimana Bhatt 2 Dec 16, 2021
CellProfiler is a open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automaticall

CellProfiler 732 Dec 23, 2022
The virtual calculator will be above the live streaming from your camera

The virtual calculator is above the live streaming from my camera usb , the program first detect my hand and in each frame calculate the distance between two finger ,if the distance is lower than the

gasbaoui mohammed al amine 5 Jul 01, 2022
caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection Abstract This is a caffe re-implementation of R2CNN: Rotational Region CNN fo

candler 80 Dec 28, 2021
Driver Drowsiness Detection with OpenCV & Dlib

In this project, we have built a driver drowsiness detection system that will detect if the eyes of the driver are close for too long and infer if the driver is sleepy or inactive.

Mansi Mishra 4 Oct 26, 2022
Text Detection from images using OpenCV

EAST Detector for Text Detection OpenCV’s EAST(Efficient and Accurate Scene Text Detection ) text detector is a deep learning model, based on a novel

Abhishek Singh 88 Oct 20, 2022
Document Layout Analysis

Eynollah Document Layout Analysis Introduction This tool performs document layout analysis (segmentation) from image data and returns the results as P

QURATOR-SPK 198 Dec 29, 2022
Web interface for browsing arXiv papers

Currently, arxivbox considers only major computer vision and machine learning conferences

Ankan Kumar Bhunia 12 Sep 11, 2022
Lightning Fast Language Prediction 🚀

whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req

Indix 152 Oct 16, 2022
MXNet OCR implementation. Including text recognition and detection.

insightocr Text Recognition Accuracy on Chinese dataset by caffe-ocr Network LSTM 4x1 Pooling Gray Test Acc SimpleNet N Y Y 99.37% SE-ResNet34 N Y Y 9

Deep Insight 99 Nov 01, 2022