Extract image text without using Tesseract


I want to extract a value from an image with a Python script. The main solution suggested is Tesseract, which unfortunately I have not been able to get working, even after trying several approaches. Is there another, simpler tool to perform this task with Python?

1 answer



The term for this is OCR (optical character recognition), and searching for it can help you find other solutions. That said, you probably forgot to install PIL or did not know how to install pytesser.
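Before looking for alternatives, it may help to confirm which piece is actually missing. The following is only a rough sketch using the standard library: the helper name `check_ocr_setup` is made up for illustration, and it assumes the usual names (`PIL` and `pytesseract` as import names, `tesseract` as the executable on the PATH).

```python
import importlib.util
import shutil


def check_ocr_setup(modules=("PIL", "pytesseract"), executable="tesseract"):
    """Report which pieces of a Tesseract-based setup can be found.

    Hypothetical helper: the default names are assumptions about a
    typical install, not something taken from the libraries themselves.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module is not importable
        report[name] = importlib.util.find_spec(name) is not None
    # shutil.which returns None when the program is not on the PATH
    report[executable] = shutil.which(executable) is not None
    return report


if __name__ == "__main__":
    for component, found in check_ocr_setup().items():
        print(f"{component}: {'found' if found else 'MISSING'}")
```

If the Python modules are reported as found but the executable is missing, the problem is most likely the Tesseract program itself rather than the Python wrapper.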

pytesser can be installed manually:

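Or via pip (note that the command below installs pytesseract, a separate wrapper that is imported as `pytesseract`, not `pytesser`):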

pip install pytesseract

But before installing it you must install the:

And an example from its brief documentation/README:

from pytesser import *

image = Image.open('fnord.tif')  # Open an image using PIL
print(image_to_string(image))

Or

from pytesser import *

print(image_file_to_string('fnord.tif'))

Of course, this lib has not been updated since 2007, so I expect it to have a number of problems with Python 3 (or not be compatible at all; I couldn't test). The solution I believe most people will suggest is OpenCV, and there is a small ready-made lib that can be installed via pip:

It was inspired by the solutions proposed in https://stackoverflow.com/q/9413216/1518921. I'm not sure whether the lib itself recognizes only numbers, as in that original question, but if the intention is to adapt it or even learn from it, there are some examples you can study.
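To give a feel for how simple digit recognition can work without Tesseract, here is a toy sketch of nearest-neighbour matching on pixel values, in pure Python. It is not the code of any library mentioned above: the 3x5 glyphs in `TEMPLATES` and the functions `to_vector`, `classify` and `read_digits` are invented for illustration, and a real project would build its reference samples from labelled crops of actual images (for example with OpenCV).

```python
# Hypothetical 3x5 reference glyphs ('#' = ink, '.' = background).
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "2": ["###", "..#", "###", "#..", "###"],
    "3": ["###", "..#", "###", "..#", "###"],
    "4": ["#.#", "#.#", "###", "..#", "..#"],
    "5": ["###", "#..", "###", "..#", "###"],
    "6": ["###", "#..", "###", "#.#", "###"],
    "7": ["###", "..#", "..#", "..#", "..#"],
    "8": ["###", "#.#", "###", "#.#", "###"],
    "9": ["###", "#.#", "###", "..#", "###"],
}


def to_vector(glyph):
    """Flatten rows of '#'/'.' into a flat list of 0/1 pixel values."""
    return [1 if ch == "#" else 0 for row in glyph for ch in row]


def classify(glyph):
    """Return the label whose template differs in the fewest pixels."""
    sample = to_vector(glyph)

    def distance(label):
        reference = to_vector(TEMPLATES[label])
        return sum(a != b for a, b in zip(sample, reference))

    return min(TEMPLATES, key=distance)


def read_digits(rows, width=3):
    """Classify fixed-width glyphs separated by one blank column."""
    count = (len(rows[0]) + 1) // (width + 1)
    digits = []
    for i in range(count):
        start = i * (width + 1)
        glyph = [row[start:start + width] for row in rows]
        digits.append(classify(glyph))
    return "".join(digits)


if __name__ == "__main__":
    bitmap = [
        "#.#.###",
        "#.#...#",
        "###.###",
        "..#.#..",
        "..#.###",
    ]
    print(read_digits(bitmap))
```

Because the match is by smallest pixel difference rather than exact equality, a glyph with a little noise can still be assigned to the closest digit. Real images additionally need thresholding and segmentation before a step like this.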

  • Thank you for the answer. I studied these and other examples carefully and realized that I will have to use Tesseract itself. So far I have not found a simpler solution.
