documentReader

A program that helps visually impaired users by using OCR to scan the t5-22b tax file and convert its contents to an MP3 file. Tesseract OCR was chosen so that the program can read the contents of scanned images, which PDF parsers cannot access. This improves accessibility and broadens the range of documents that can be read.

Technologies used

Modules: gTTS, numpy, matplotlib, pytesseract, playsound

How it works

The program first converts the PDF in the testingPDF directory to a set of JPG images and stores them in the PDFoutput directory. It then maps the vertical and horizontal lines of the grid in the original file (the outputs of the mapping can be viewed in vertical_lines.jpg and horizontal_lines.jpg). The vertical and horizontal line maps are combined into a single grid map, which can be viewed in img_final_bin.jpg.
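The line-mapping step can be illustrated with a minimal sketch. This README does not say how main.py detects the lines, so everything here is an assumption: the function names long_runs and grid_map, the min_len threshold, and the run-length approach itself are hypothetical, chosen only because they need nothing beyond numpy. The idea is that a grid line is a long unbroken run of dark pixels, while text strokes are short, so keeping only long runs separates the grid from the text.

```python
import numpy as np


def long_runs(binary: np.ndarray, min_len: int) -> np.ndarray:
    """Keep only pixels that sit in a horizontal run of >= min_len set pixels."""
    height, width = binary.shape
    out = np.zeros_like(binary, dtype=bool)
    for y in range(height):
        x = 0
        while x < width:
            if binary[y, x]:
                start = x
                while x < width and binary[y, x]:
                    x += 1
                if x - start >= min_len:
                    out[y, start:x] = True
            else:
                x += 1
    return out


def grid_map(binary: np.ndarray, min_len: int):
    """Return (horizontal, vertical, combined) line maps for a boolean image."""
    horizontal = long_runs(binary, min_len)
    # Vertical runs are horizontal runs of the transposed image.
    vertical = long_runs(binary.T, min_len).T
    return horizontal, vertical, horizontal | vertical


# Tiny synthetic page (hypothetical data): one box outline plus a short stroke.
page = np.zeros((7, 9), dtype=bool)
page[1, 1:8] = True   # top edge
page[5, 1:8] = True   # bottom edge
page[1:6, 1] = True   # left edge
page[1:6, 7] = True   # right edge
page[3, 3:5] = True   # short stroke standing in for text

horizontal, vertical, grid = grid_map(page, min_len=5)
```

In this toy example the short stroke inside the box is dropped from the combined map, while the four edges survive. On real scans, morphological opening with long, thin kernels is a common alternative to this run-length filter.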

After img_final_bin.jpg is created, contour mapping finds the boxes that the Tesseract model needs to scan. The mapped boxes are shown in result.png. After each box is scanned, the text and key information are stored in imageText.txt, which is then converted to an MP3 file using Google Text-to-Speech (gTTS).
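The last part of this pipeline, turning scanned boxes into a text file and then into speech, might look roughly like the sketch below. It is not taken from main.py. The Box tuple layout, the row_tolerance value, the helper names reading_order, build_transcript, and save_outputs, the sample data, and the MP3 file name are all assumptions made for illustration. Contour detection and the Tesseract call are left out; each box is assumed to already carry its recognized text.

```python
from typing import List, Tuple

# Hypothetical box record: x, y, width, height, recognized text.
Box = Tuple[int, int, int, int, str]


def reading_order(boxes: List[Box], row_tolerance: int = 10) -> List[Box]:
    """Sort boxes top to bottom, then left to right within each row."""
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        # A box joins the current row if its top edge is close to the row's first box.
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


def build_transcript(boxes: List[Box]) -> str:
    """Join the non-empty box texts in reading order, one per line."""
    texts = (b[4].strip() for b in reading_order(boxes))
    return "\n".join(t for t in texts if t)


def save_outputs(text: str, text_path: str = "imageText.txt",
                 mp3_path: str = "scannedFile.mp3") -> None:
    """Write the transcript, then synthesize it (gTTS needs network access)."""
    from gtts import gTTS  # imported lazily so the rest runs without gTTS installed
    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    gTTS(text=text, lang="en").save(mp3_path)


# Made-up sample boxes, deliberately out of order, with one empty box.
sample = [
    (200, 12, 80, 20, "Amount"),
    (10, 10, 80, 20, "Name"),
    (10, 60, 80, 20, " 100.00 "),
    (200, 58, 80, 20, ""),
]
transcript = build_transcript(sample)
```

Calling save_outputs(transcript) would then write the text file and the MP3. Grouping boxes into rows before sorting by x keeps cells that are slightly misaligned vertically in the order a listener would expect.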

How to test

Place any modified version of the t5-22b tax PDF in the testingPDF directory and run main.py. The program reads the form and automatically replaces the current image, text, and MP3 files.

