make a better chinese character recognition OCR than tesseract

Overview

deep ocr

See README_en.md for English installation documentation.

只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整:

git clone https://github.com/JinpengLI/deep_ocr.git ~/deep_ocr
virtualenv ~/deep_ocr_env
source ~/deep_ocr_env/bin/activate
pip install -r ~/deep_ocr/requirements.txt
cd ~/deep_ocr && python setup.py install

测试

source ~/deep_ocr_env/bin/activate && cd ~/deep_ocr && ./bin/deep_ocr_reco data/holiday_notification.jpg -v -d

旧版说明

部分还能用,暂时保留,以后准备删除.

估计很多开发员使用tesseract做中文识别,但是结果不是一般的差,譬如下面的图片

alt text

$ tesseract -l chi_sim data/test_data.png out_test_data
看到恨多公司在招腭大改癫和机器字习胸人 v 我有3个建议 (T) 忧T ' 2个上t较靠遭
胸人就譬了 v不是越多越好 (2) 这T '2个人要能给大蒙上踝'倩邂知L目 (3) 不要招
不宣代四胸人:虹大改癫和机器字习胸v不裹目宣 (或者宣过) 大量代四v基本上就
只会忽悠了

其实现在做文字识别不是很难,特别基于深度学习,这里是这个项目的reco_chars.py脚本,基于caffe的识别效果,是不是好很多?而且代码比tesseract短很多。

$ python reco_chars.py
看很多公苘在招聘天数据和机器学习人我有个建议找个较靠谱
的人就够了不是越多越好这个人要给大家上课传递知识不要招
不写代码的人做天数据机器学习的不亲写或者写过天且代码基本上就
只会忽悠了

大家可以基于caffe训练自己的字体,系统基于这个文章开发单个字的识别:

Deep Convolutional Network for Handwritten Chinese Character Recognition

http://yuhao.im/files/Zhang_CNNChar.pdf

通过 Docker 安装

先安装docker,以下教程在Ubuntu 14.04 通过测试

https://www.docker.com/

下载deep_ocr_workspace.zip (https://pan.baidu.com/s/1nvz2wrBhttps://pan.baidu.com/s/1qYPKH3Y )

两个文件的md5sum值,用于校验文件是否成功下载。

$ md5sum deep_ocr_workspace.zip
ffeda7ea6604e7b8835c05a33fa0459e  deep_ocr_workspace.zip
$ md5sum deep_ocr_workspace.z01
ea66796c2bbdb2bec9b7ee28eb44012d  deep_ocr_workspace.z01

解压到本地硬盘,譬如到以下地方 (~/deep_ocr_workspace)

cat deep_ocr_workspace.z* > unsplit_deep_ocr_workspace.zip
unzip unsplit_deep_ocr_workspace.zip -d ~/

这个zip包含deep_ocr所有需要数据文件(由于太大了,所以放百度云了)。所有数据到解压到 ~/deep_ocr_workspace,你也可以把需要处理的数据放到这个文件夹。

基于cpu

docker pull jinpengli/deep_ocr_cpu_docker:latest

启动 docker container

docker run -ti --volume=${HOME}/deep_ocr_workspace:/workspace jinpengli/deep_ocr_cpu_docker:latest /bin/bash
cd /opt/deep_ocr
git pull origin master

volume用于mount到container里面,这样可以获取上面的识别结果。

python /opt/deep_ocr/reco_chars.py

然后可以继续你们的开发。。。。加油。。。

身份证识别

暂时不是很稳定,需要加一些语义模型。等等吧。。。。

识别图片

识别图片

执行命令

export WORKSPACE=/workspace
deep_ocr_id_card_reco --img $DEEP_OCR_ROOT/data/id_card_img.jpg             --debug_path /tmp/debug             --cls_sim ${WORKSPACE}/data/chongdata_caffe_cn_sim_digits_64_64             --cls_ua ${WORKSPACE}/data/chongdata_train_ualpha_digits_64_64

识别结果:

...
ocr res:
============================================================
name
韦小宝
============================================================
address
北京市东城区累山前街4号
紫禁城敬事房
============================================================
month
12
============================================================
minzu
汉
============================================================
year
1654
============================================================
sex
男
============================================================
id
1X21441114X221243X
============================================================
day
20

Owner
Jinpeng
Jinpeng
Ddddocr - 通用验证码识别OCR pypi版

带带弟弟OCR通用验证码识别SDK免费开源版 今天ddddocr又更新啦! 当前版本为1.3.1 想必很多做验证码的新手,一定头疼碰到点选类型的图像,做样本费时

Sml2h3 4.4k Dec 31, 2022
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

Andres 13 Dec 17, 2022
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 02, 2023
Deep Learning Chinese Word Segment

引用 本项目模型BiLSTM+CRF参考论文:http://www.aclweb.org/anthology/N16-1030 ,IDCNN+CRF参考论文:https://arxiv.org/abs/1702.02098 构建 安装好bazel代码构建工具,安装好tensorflow(目前本项目需

2.1k Dec 23, 2022
Image Smoothing and Blurring Using OpenCV

Image-Smoothing-and-Blurring-Using-OpenCV This repository contains codes for performing image smoothing and blurring using OpenCV. There are different

Happy N. Monday 3 Feb 15, 2022
make a better chinese character recognition OCR than tesseract

deep ocr See README_en.md for English installation documentation. 只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整: git clone https://github.com/JinpengLI/deep

Jinpeng 1.5k Dec 28, 2022
Pre-Recognize Library - library with algorithms for improving OCR quality.

PRLib - Pre-Recognition Library. The main aim of the library - prepare image for recogntion. Image processing can really help to improve recognition q

Alex 80 Dec 30, 2022
Deep learning based page layout analysis

Deep Learning Based Page Layout Analyze This is a Python implementaion of page layout analyze tool. The goal of page layout analyze is to segment page

186 Dec 29, 2022
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.

Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr

Eric Ihli 311 Dec 24, 2022
Web interface for browsing arXiv papers

Currently, arxivbox considers only major computer vision and machine learning conferences

Ankan Kumar Bhunia 12 Sep 11, 2022
This repo contains several opencv projects done while learning opencv in python.

opencv-projects-python This repo contains both several opencv projects done while learning opencv by python and opencv learning resources [Basic conce

Fatin Shadab 2 Nov 03, 2022
Application that instantly translates sign-language to letters.

Sign Language Translator Project Description The main purpose of project is translating sign-language to letters. In accordance with this purpose we d

3 Sep 29, 2022
Learning Camera Localization via Dense Scene Matching, CVPR2021

This repository contains code of our CVPR 2021 paper - "Learning Camera Localization via Dense Scene Matching" by Shitao Tang, Chengzhou Tang, Rui Hua

tangshitao 65 Dec 01, 2022
A machine learning software for extracting information from scholarly documents

GROBID GROBID documentation Visit the GROBID documentation for more detailed information. Summary GROBID (or Grobid, but not GroBid nor GroBiD) means

Patrice Lopez 1.9k Jan 08, 2023
[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks This is an official PyTorch code repository of the paper "Cloud Transformers:

Visual Understanding Lab @ Samsung AI Center Moscow 27 Dec 15, 2022
Creating a virtual tv using opencv in python3.

Virtual-TV Creating a virtual tv using opencv in python3. In order to run the code follow the below given steps: Make sure the desired videos which ar

Vamsi 1 Jan 01, 2022
QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021)

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 119 Dec 02, 2022
Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

CRAFT: Character-Region Awareness For Text detection Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector | Paper |

188 Dec 28, 2022
The code for CVPR2022 paper "Likert Scoring with Grade Decoupling for Long-term Action Assessment".

Likert Scoring with Grade Decoupling for Long-term Action Assessment This is the code for CVPR2022 paper "Likert Scoring with Grade Decoupling for Lon

10 Oct 21, 2022
Hand gesture detection project with aweome UI implementation.

an awesome hand gesture detection project for you to be creative! Imagination is the limit to do with this project.

AR Ashraf 39 Sep 26, 2022