Complete portable pipeline for masking of Aadhaar Number adhering to Govt. Privacy Guidelines.

Overview

Aadhaar Number Masking Pipeline

Implementation of a complete pipeline that masks the Aadhaar Number in given images to adhere to Govt. of India's Privacy Guidelines for storage of Aadhaar Card images in digital repository. The following project was carried out as an internship for Muthoot Finance. We make use of the open source packages CRAFT text detector | Paper | Pretrained Model | Github Repo provided by Clova AI Research for OSD and combine a heurestic model with pytesseract OCR for masking.

Rohit Ranjan, Ram Sundaram.

Sample Results

teaser

Versions

The search for the best masking pipeline led us to experiment with several different approaches. We have documented our experiments in other branches.

Branch(->model) Speed/Performance Pipeline
main Best performing CRAFT + pytesseract + dimensional heuristics
CNN_OCR->cnn_model Fastest masking CRAFT + LeNet trained by us
CNN_OCR->cnn_model_2 Fastest masking CRAFT + LeNet trained by us
UNET_OCDR Theoretically Fastest but trained model unavailable** UNet

**We proposed and implemented a pipeline which uses a single UNet model for achieving a desirable mask. A single model would have made the inference very fast and real time use capable on mobile devices. Training meant creating a dataset since the company could not legally provide us the needed data. After several trials, we halted work on this model because with barely 150 unique datapoints available, a data hungry UNet Model is simply unsatiable for now.

Datasets

Lenet was trained on our self-created labelled dataset | labels.

Getting started

Install dependencies

Requirements

  • torch
  • opencv-python
  • tesseract-ocr
  • check requirements.txt
pip install -r requirements.txt

Test instructions

  • Clone this repository
git clone https://github.com/thefurorjuror/Aadhaar_Masker.git
  • Run on an image folder
python [folder path to the cloned repo]/masker.py --test_folder=[folder path to test images] --output_folder=[folder path to output images] --cuda=[True/False]
#Example- When one is inside the cloned repo
python masker.py --test_folder=./images/ --output_folder=./output/ --cuda=True

cuda is set to False by default.

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

License

Copyright (c) 2021-present Rohit Ranjan & Ram Sundaram.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
A Telegram Bot to manage your music channel with some cool features.

Music Channel Manager V2 A Telegram Bot to manage your music channel with some cool features like appending your predefined username to the musics tag

11 Oct 21, 2022
Enigma simulator with python and clean code.

Enigma simulator with python and clean code.

Mohammad Dori 3 Jul 21, 2022
inventory replenishment for a hospital.

Inventory-Replenishment Inventory-Replenishment for a hospital that would like to explore how advanced anlytics may help automate their decision proce

1 Jan 09, 2022
Telegram music & video bot direct play music

Telegram music & video bot direct play music

noinoi-X 1 Dec 28, 2021
Pythonic event-processing library based on decorators

Process Events In Style This library aims to simplify the common pattern of event processing. It simplifies the process of filtering, dispatching and

Nicolas Marier 3 Sep 01, 2022
Simple stock price analytics

mune · Mune is an open source python web application built to analyze stocks, named after Homma Munehisa. Currently, the forecasting component is powe

Richard Hong 14 Aug 30, 2021
A simple Discord bot that notifies users of new Abitti versions

A simple Discord bot that notifies users of new Abitti versions. New features might be added later on. If you have good ideas, feel free to do a PR.

1 Feb 11, 2022
An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Telegram Video Player Bot (Beta) An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. Special Features Supports Live Streaming From

SAF ONE 206 Jan 03, 2023
Just a python library to make reddit post caching easier

Reddist Just a python library to make reddit post caching easier. Caching Options In Memory Caching Redis Caching Pickle Caching Usage Installation: D

Samrid Pandit 3 Jan 16, 2022
"Nesse projeto criei uma automação para abrir as tarefas no Jira em massa pegando de uma determinada fila do Zendesk."

automacao-Zendesk "Nesse projeto criei uma automação para abrir as tarefas no Jira em massa pegando de uma determinada fila do Zendesk." en-us "In thi

tokoyamy 1 Dec 20, 2021
A VCVideoPlayer Bot for Telegram made with 💞 By @ProErrorXD

VC Video Player How To Host ✨ Heroku Deploy ✨ The easiest way to deploy this Bot is via Heroku. Credit 🔥 |🇮🇳 Louis |🇮🇳 Sammy |🇮🇳 Blaze Marsha

丂ムᄊᄊƳ 95 May 17, 2022
数字货币动态趋势网格,随着行情变动。目前实盘月化10%。目前支持币安,未来上线火币、OKEX。

数字货币动态趋势网格,随着行情变动。目前实盘月化10%。目前支持币安,未来上线火币、OKEX。

幸福村的码农 98 Dec 27, 2022
A Telegram Bot with(Forwarder Bot + User Bot + More Features )

A Telegram Bot with(Forwarder Bot + User Bot + More Features )

Kaif 3 Feb 16, 2022
Send pm to Admin - Telegram

Send pm to Admin - Telegram

Ahoora 3 Nov 17, 2022
List of twitch bots n bigots

This is a collection of bot account names NamelistMASTER contains all the names we reccomend you ban in your channel Sometimes people get on that list

62 Sep 05, 2021
BLYRIC is a Twitter bot that tweets a song lyric every night.

BLYRIC BLYRIC, a bot that tweets a song lyric every night. Follow on Twitter: @blyric_ Overview BLYRIC is a Twitter bot that tweets a song quote every

Bruno Kenzo Hyodo 6 Oct 05, 2022
Male' Map Telegram Bot

Male' Map TelegramBot A simple TelegramBot to fetch residential addresses in Male', Maldives. The bot can be queried inline or directly. sample .env f

Naail Abdul Rahman 12 Nov 25, 2022
Webb-Tracker-Bot - This is a discord bot that displays current progress of the James Webb Space Telescope.

Webb-Tracker-Bot - This is a discord bot that displays current progress of the James Webb Space Telescope.

Copperbotte 1 Jan 05, 2022
A Python bot that uses the Reddit API to send users inspiring messages.

AnonBot By Edric Antoine A Python bot that uses the Reddit API to send users inspiring messages. When a message includes 'What would Anon do?', the bo

1 Jan 05, 2022