An OCR system for the Arabic language that converts images of typed text to machine-encoded text.

Last update: Jan 05, 2023

Overview

Arabic OCR

The system currently supports letters only: the 29 letters ا-ى and لا.
It targets a simplified OCR problem: images that contain only Arabic characters (see the dataset link below for sample images).

Setup

Install Python, then run this command:

pip install -r requirements.txt

Run

Put the images in the src/test directory.
Go to the src directory and run the following command:
```
python OCR.py
```
An output folder will be created containing:
- a text folder with one text file per input image.
- a running_time file with the time taken to process each image.
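
The output layout described above can be sketched roughly as follows. This is a minimal illustration, not the code in OCR.py: `recognize` is a hypothetical placeholder for the recognition step, and the `running_time.txt` file name and its tab-separated line format are assumptions.

```python
import time
from pathlib import Path


def recognize(image_path: Path) -> str:
    """Hypothetical placeholder for the recognition step; returns empty text."""
    return ""


def run_batch(image_dir: Path, output_dir: Path, recognizer=recognize) -> dict:
    """Write one text file per image and record the time spent on each image."""
    text_dir = output_dir / "text"
    text_dir.mkdir(parents=True, exist_ok=True)
    timings = {}
    for image_path in sorted(image_dir.iterdir()):
        if not image_path.is_file():
            continue
        start = time.perf_counter()
        text = recognizer(image_path)
        timings[image_path.name] = time.perf_counter() - start
        # The text file shares the image's base name, e.g. a.png -> a.txt.
        (text_dir / f"{image_path.stem}.txt").write_text(text, encoding="utf-8")
    # Assumed file name and line format; the real running_time file may differ.
    lines = [f"{name}\t{seconds:.3f}" for name, seconds in timings.items()]
    (output_dir / "running_time.txt").write_text("\n".join(lines), encoding="utf-8")
    return timings
```

Calling `run_batch(Path("test"), Path("output"))` from the src directory would mirror the folder names used in this README.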

Pipeline

Dataset

Link to the dataset of images and their corresponding text: here.
We used 1000 images to generate the character dataset used for training.

Examples

Line Segmentation
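
This README does not describe how lines are separated. One common approach, shown here only as an assumption about what such a step might look like, is a horizontal projection profile: rows with no ink mark the gaps between text lines. The function name `segment_lines` and the convention that ink pixels are nonzero are illustrative choices.

```python
import numpy as np


def segment_lines(binary: np.ndarray) -> list[tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    `binary` is a 2-D array in which ink pixels are nonzero and background is 0.
    """
    has_ink = binary.sum(axis=1) > 0  # horizontal projection profile
    lines = []
    start = None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row  # first inked row of a new line
        elif not ink and start is not None:
            lines.append((start, row))  # blank row closes the current line
            start = None
    if start is not None:
        lines.append((start, len(has_ink)))  # line touching the bottom edge
    return lines
```

The same idea applied along columns (`axis=0`) is often used as a starting point for word segmentation, although Arabic's connected script usually needs more than blank-column detection at the character level.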

Word Segmentation

Character Segmentation

Performance

Average accuracy: 95%.
Average time per image: 16 seconds.

NOTE

We achieved these results using only the flattened image as the feature.
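
As a rough sketch of what a flattened-image feature looks like: a segmented character image is brought to a fixed size and unrolled into a 1-D vector that a classifier can consume. The 28x28 target size, the nearest-neighbor resampling, and the name `flatten_feature` are assumptions made for illustration; the project's actual preprocessing may differ.

```python
import numpy as np


def flatten_feature(char_image: np.ndarray, size: tuple[int, int] = (28, 28)) -> np.ndarray:
    """Resize a 2-D character image to `size` and flatten it to a 1-D vector."""
    # Nearest-neighbor resampling: pick a source row/column for each target index.
    rows = np.arange(size[0]) * char_image.shape[0] // size[0]
    cols = np.arange(size[1]) * char_image.shape[1] // size[1]
    resized = char_image[np.ix_(rows, cols)]
    # Binarize (ink = 1.0, background = 0.0) and unroll row by row.
    return (resized > 0).astype(np.float32).ravel()
```

Every character then yields a vector of the same length (784 with the assumed size), regardless of its original width and height.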

Owner

Hussein Youssef