How to trace letters and cut them?

Asked

Viewed 389 times

11

I have this image below. As you can see, they are together. In this case, it’s a "7", "h" and "a". It is possible to scan the image to find them and then cut them where exactly you found them?

inserir a descrição da imagem aqui

I’m using the Opencv, OCR and Tesseract library in Java. I’ve looked for something like findcontours, convexHull... but I didn’t get the result I want.

  • I’m very interested in an answer to that! Some time ago I had to implement a data import and what broke was just this security token! Palliatively (solved my problem) I presented this image for the user of my system to type.

  • 5

    This kind of image was produced that way for the exact purpose that you couldn’t do what you want to do. Apparently, the system that generated it has achieved its goal.

  • Actually, this image I built in Photoshop myself.

  • Dear friends, and to all who come to ask me the same thing. My interest is only academic, I’ve been studying these libraries for a while, and I’m testing all the methods they can offer me

  • This is precisely the main purpose of CAPTCHAS: Not to allow a machine (or processing algorithm) to decipher them. However, there are some techniques based on "Human-Based Computing" and "Artificial Intelligence" that proved to be very efficient in solving the problem. Anyway, the answer to your question is still a big challenge.

  • This has already been asked here: https://answall.com/questions/11757/decoderfor captcha

  • If anyone has the perfect answer to this, they will win the Nobel Prize...

  • Have you tried using a convolutional neural network?

Show 3 more comments

1 answer

2

This problem is typical of an OCR. Known as character segmentation (char segmentation algorithms, if you search on google), this problem is studied since the years 50.

Overall, the algorithm that both Tesseract and ocropus use (the 2 best-known opensource Ocrs) does the following:

  1. Find the connected components - Scan the image and find all pixels that have neighbors. So characters that do not stick together can be easily separated.

  2. For each connected component, compute a box that frames all pixels of the component (bounding box).

Only, as you mentioned, this algorithm fails in 2 cases. When the characters stick and when a character breaks in 2.

To solve this problem, ocrs use a technique called Seam Carving. In general terms, it is an algorithm that calculates a path that costs less "energy", or goes through fewer obstacles. In the case of character segmentation, a path with less obstruction is a vertical path with less of the character contour.

For example, let’s imagine that we have the following picture:

Flight Engineer

The paths with the least obstacles are as follows::

inserir a descrição da imagem aqui

In this image, the possible cuts are in black in the sentence and at the bottom a graph showing the amount of obstacles calculated by the algorithm for each column of pixels.

The result of this algorithm is not the cuts themselves, but the cutting possibilities. Some are right and some are not. The next step of the OCR aims to solve this problem.

At this stage, the OCR generates several possibilities of clipping, joining or separating the segments found, as in the image below:

inserir a descrição da imagem aqui

This generates dozens of possibilities for OCR. But how does OCR know which is the correct cut? Simple, it plays every possibility on a character recognizer, which will return a degree of confidence that that cutout be a letter.

The OCR then picks up all the clippings in which the sum of the degrees of confidence is as high as possible.

PS: I’m not going to detail every algorithm here because it would give a 5,000 word post, but you can study all of these algorithms implemented in Ocropus.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.