Comments on Tesseract OCR: Installation and Usage on Ubuntu 16.04

Tesseract is one of the most powerful open source OCR engines available today. OCR stands for Optical Character Recognition. This tutorial shows the installation and usage of Tesseract on Ubuntu 16.04.

7 Comment(s)


Comments

By: Nincholas

Hi there, I recommend taking a look at the Tesseract 4.0 alpha packages. There's an option to use a recognition engine based on some of Google's AI work, and a hybrid option that combines the traditional engine with the new AI engine. Both are considerably more accurate than what Tesseract 3.0 uses.
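
For anyone who wants to try the engine options mentioned above, here is a minimal Python sketch. It assumes Tesseract 4.x is installed and on the PATH, and that the engine is chosen with the `--oem` option (0 = traditional engine, 1 = neural network engine, 2 = both combined, 3 = whatever is available). The helper name `engine_command` and the file names are hypothetical, picked only for illustration.

```python
import shutil
import subprocess

# OCR engine modes accepted by Tesseract 4's --oem option.
# Modes 0 and 2 need traineddata files that still include the legacy model.
OEM_MODES = {"legacy": 0, "lstm": 1, "hybrid": 2, "default": 3}


def engine_command(image, output_base, engine="lstm"):
    """Build the argument list for one OCR run.

    Tesseract appends ".txt" to output_base by itself.
    """
    return ["tesseract", image, output_base, "--oem", str(OEM_MODES[engine])]


def run_ocr(image, output_base, engine="lstm"):
    """Run Tesseract if it is installed; return its exit code, else None."""
    if shutil.which("tesseract") is None:
        return None
    return subprocess.run(engine_command(image, output_base, engine)).returncode
```

Calling `run_ocr("scan.png", "scan_out", engine="hybrid")` would, under these assumptions, write the recognized text to `scan_out.txt` using the combined engines.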

By: Thanks

Excellent comment by Nincholas (many thanks) - Tesseract 3 is quite bad even if you feed it a screenshot. On the other hand 4.0 works quite well.

By: jainam

Suppose the command for Tesseract is "tesseract [image] [filename]". How can I give a path for the filename?

Please reply quickly.
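
One possible answer, as a hedged sketch: the second argument to `tesseract` is an output base name, and it can include a directory. Tesseract adds the `.txt` extension itself, so the path should be given without one. The Python helpers below only build that path and the command; the function names and the example directories are hypothetical, and the output directory is assumed to exist already.

```python
from pathlib import Path


def output_base_for(image_path, output_dir):
    """Return output_dir/<image name without extension> as a string.

    No ".txt" is added here because Tesseract appends it on its own.
    """
    return str(Path(output_dir) / Path(image_path).stem)


def path_command(image_path, output_dir):
    """Build the argument list: tesseract <image> <output base>."""
    return ["tesseract", str(image_path), output_base_for(image_path, output_dir)]


if __name__ == "__main__":
    # Prints the command that would be run; nothing is executed here.
    print(" ".join(path_command("/home/user/scans/invoice.png", "/home/user/ocr")))
```

On the command line this corresponds to running `tesseract /home/user/scans/invoice.png /home/user/ocr/invoice`, which should leave the text in `/home/user/ocr/invoice.txt`.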

By: Phani

This is very helpful.

By: akshat dashore

I have an income tax document. I tried all kinds of resizing to improve accuracy, but a few words are always missing.

Please tell me what to do.

By: jay

It's not working properly.

By: Zurab Gvishiani

Hello,

For cleaning up text, take a look at Fred's TextCleaner script for ImageMagick: http://www.fmwconcepts.com/imagemagick/textcleaner/