Comments on Tesseract OCR: Installation and Usage on Ubuntu 16.04
Tesseract is one of the most powerful open source OCR engines available today. OCR stands for Optical Character Recognition. This tutorial shows the installation and usage of Tesseract on Ubuntu 16.04.
7 Comment(s)
Comments
Hi there, I recommend taking a look at the Tesseract 4.0 alpha packages. There's an option to use a recognition engine based on some of Google's AI work, and a hybrid option combining the traditional engine with the new AI engine, both of which are considerably more accurate than what Tesseract 3.0 uses.
Excellent comment by Nincholas (many thanks) - Tesseract 3 is quite bad even if you feed it a screenshot. On the other hand 4.0 works quite well.
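To make the engine choice above concrete: in Tesseract 4.x the engine is, as far as I know, selected with the `--oem` flag (0 = legacy engine, 1 = neural-net LSTM engine, 2 = legacy + LSTM combined, 3 = default based on what is available). Here is a minimal Python sketch that only builds the command line. The names `OEM_MODES` and `tesseract_args` are made up for illustration, and modes 0 and 2 assume your installed traineddata still contains the legacy model.

```python
# Hypothetical helper: maps a readable engine name to Tesseract 4's --oem value.
OEM_MODES = {
    "legacy": 0,   # traditional engine only
    "lstm": 1,     # neural-net (LSTM) engine only
    "hybrid": 2,   # legacy + LSTM combined
    "default": 3,  # let Tesseract pick based on what is available
}


def tesseract_args(image_path, output_base, engine="lstm"):
    """Build the argument list for a Tesseract 4.x call with a chosen engine."""
    return ["tesseract", image_path, output_base, "--oem", str(OEM_MODES[engine])]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Tesseract.
    print(" ".join(tesseract_args("scan.png", "result", engine="hybrid")))
```

You can pass the returned list to `subprocess.run(..., check=True)` once Tesseract 4 is installed.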
Suppose the command for tesseract is "tesseract [image] [filename]"; how can I give a path for the filename?
Please reply quickly.
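On the path question: as far as I know, the second argument is an output base name rather than a full filename, and Tesseract appends ".txt" to it itself. So the base can include a directory, for example "tesseract scan.png /home/user/ocr/result" should write /home/user/ocr/result.txt, provided the directory already exists. A small Python sketch of that, where the function names and the example paths are hypothetical:

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Return the argument list for `tesseract IMAGE OUTPUTBASE`."""
    base = Path(output_base)
    # Tesseract adds ".txt" on its own, so drop it if the caller included it.
    if base.suffix == ".txt":
        base = base.with_suffix("")
    return ["tesseract", str(image_path), str(base)]


def run_ocr(image_path, output_base):
    """Run Tesseract and return the path of the text file it should produce."""
    cmd = build_tesseract_command(image_path, output_base)
    # Assumption: the output directory must exist before Tesseract writes to it.
    Path(cmd[2]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)  # requires the tesseract binary on PATH
    return Path(cmd[2] + ".txt")
```

For example, `run_ocr("scan.png", "ocr_output/result")` would run the command and hand back the location of the resulting text file.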
This is very helpful.
I have an income tax document. I tried all kinds of resizing to improve accuracy, but a few words are always missing.
Please tell me what to do.
It's not working properly.
Hello
For cleaning text, take a look at Fred's TextCleaner script for ImageMagick: http://www.fmwconcepts.com/imagemagick/textcleaner/