How to find out if I need to rotate the image

3

I am implementing features that perform OCR on images, and so far everything is going well. The problem is that the OCR only works well when the image is correctly oriented. If the image is upside down or rotated by 90°, the OCR returns text that I would call garbage, because it makes no sense when I try to work with it.

Imagine this image: [image]

Note that it is upside down. Rotating it is easy, but how do I check whether it is upside down so that I can rotate it?

  • Do these documents follow any standard pattern?

  • @Joãopaulomassa They have no pattern; the OCR runs on any image.

  • Does it take long to process? What if you read the image and, when the result is garbage, rotate it and read again until you find the right orientation? I don’t know much about OCR, it is just a suggestion.

  • The reading doesn’t take long; it reads the text even when the image is turned. But then the text comes out as something like SOMAV (instead of VAMOS). @Rovannlinhalis I also thought about re-reading the text until I get a better hit rate, but then the processing takes longer, and keep in mind that I may need to read more than 1000 images every 10 minutes.

  • Are the documents generated by you? You could try reading an excerpt of the document and checking whether it is something "valid", such as a date.

  • @Cypherpotato The images come from ordinary users. To check that, I would need a pattern, but there is no fixed pattern for the images that can be sent for reading. They may be standard documents (payroll, laws, forms, reports, etc.).

  • Does an error occur when reading these documents if they are upside down? If so, you can wrap the call in a try...catch and retry until it works properly. If not, you will have to implement some artificial intelligence to determine whether the images are rotated.

  • No error occurs. It reads the file without mercy. I found some things on the English Stack Overflow about changing the tone of the image and so on, but I thought checking the image could be simpler.

  • If you know any English, take a look here.

  • Which OCR tool are you using? See if this post helps you: https://stackoverflow.com/questions/33809499/android-java-detect-text-orientation-and-rotate-image-for-ocr. Some OCR frameworks already have this feature built in.

  • @Joãopaulomassa I’m using Microsoft Vision. I was using the ABBYY solution, but the Microsoft one reads more accurately. Still, I did not find anything that could automatically rotate the image.

  • @Thiagoaraújo, did you manage to solve the problem? Which API are you using?

  • Hello @Joãopaulomassa, I’m using Microsoft Vision, and it reads images rotated by up to 30°. I haven’t yet solved the problem of automatically detecting and rotating an upside-down (180°) image.

  • @Thiagoaraújo, I ran a test with the Computer Vision API. Is this the one you are using?

  • That very one, @Joãopaulomassa.

  • @Thiagoaraújo, you can set parameters in the request, one of which is detectOrientation=true. The API itself identifies the orientation of the image and makes the necessary adjustments to capture the information. If you have not used this parameter, I can create an example.

  • Thanks @Joãopaulomassa, but their API only corrects up to 30° (the image correction angle), as their documentation states: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#Recognizetext

2 answers

3


The latest OCR technologies have automatic rotation settings. The Computer Vision API exposes this setting through the detectOrientation parameter. According to the tests I ran, this parameter is true by default, i.e. when analyzing the image the API already rotates it in order to capture the information.

Setting the request parameters:

var requestParameters = "language=pt&detectOrientation=true";
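
For context, here is a minimal sketch of how those parameters could be attached to a request. It only builds the request and does not send it. The class name `OcrRequest`, the `endpoint` argument, and the `application/octet-stream` content type for a raw image upload are assumptions for illustration; only the query string and the Ocp-Apim-Subscription-Key header come from this answer.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class OcrRequest
{
    // Builds (but does not send) a POST request for the OCR operation.
    // "endpoint" and "subscriptionKey" are placeholders supplied by the caller.
    public static HttpRequestMessage Build(string endpoint, string subscriptionKey, byte[] imageBytes)
    {
        string requestParameters = "language=pt&detectOrientation=true";
        string uri = endpoint.TrimEnd('/') + "?" + requestParameters;

        var request = new HttpRequestMessage(HttpMethod.Post, uri);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Assumption: the image is uploaded as raw bytes.
        var content = new ByteArrayContent(imageBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        request.Content = content;
        return request;
    }
}
```

The returned message can then be passed to `HttpClient.SendAsync`, and the JSON body of the response is what the results below show.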

Using the image below: [image]

Result with detectOrientation set to true:

{
    "language": "pt",
    "textAngle": 0.0,
    "orientation": "Down",
    "regions": [{
        "boundingBox": "5,14,503,202",
        "lines": [{
            "boundingBox": "9,14,498,40",
            "words": [{
                "boundingBox": "9,14,27,32",
                "text": "If"
            }, {
                "boundingBox": "46,22,59,32",
                "text": "you"
            }, {
                "boundingBox": "118,22,59,23",
                "text": "can"
            }, {
                "boundingBox": "191,15,73,31",
                "text": "read"
            }, {
                "boundingBox": "276,15,59,31",
                "text": "this"
            }, {
                "boundingBox": "348,15,70,31",
                "text": "with"
            }, {
                "boundingBox": "431,22,76,31",
                "text": "easy"
            }]
        }, {
            "boundingBox": "5,70,503,37",
            "words": [{
                "boundingBox": "5,77,59,30",
                "text": "you"
            }, {
                "boundingBox": "79,77,52,24",
                "text": "are"
            }, {
                "boundingBox": "143,70,131,31",
                "text": "twisted!"
            }, {
                "boundingBox": "296,70,67,32",
                "text": "And"
            }, {
                "boundingBox": "378,70,79,32",
                "text": "have"
            }, {
                "boundingBox": "470,77,38,24",
                "text": "an"
            }]
        }, {
            "boundingBox": "10,124,493,34",
            "words": [{
                "boundingBox": "10,132,156,26",
                "text": "awesome"
            }, {
                "boundingBox": "179,125,105,33",
                "text": "talent!"
            }, {
                "boundingBox": "310,124,71,32",
                "text": "This"
            }, {
                "boundingBox": "391,126,27,30",
                "text": "is"
            }, {
                "boundingBox": "432,125,71,31",
                "text": "both"
            }]
        }, {
            "boundingBox": "15,175,485,41",
            "words": [{
                "boundingBox": "15,175,175,34",
                "text": "backwards"
            }, {
                "boundingBox": "204,178,59,32",
                "text": "and"
            }, {
                "boundingBox": "278,179,108,37",
                "text": "upside"
            }, {
                "boundingBox": "399,179,101,30",
                "text": "down!"
            }]
        }]
    }]
}

Now the same test with the parameter set to false:

var requestParameters = "language=pt&detectOrientation=false";

Result with detectOrientation set to false:

{
    "language": "pt",
    "textAngle": 0.0,
    "orientation": "NotDetected",
    "regions": [{
        "boundingBox": "4,15,503,202",
        "lines": [{
            "boundingBox": "12,15,485,41",
            "words": [{
                "boundingBox": "12,22,101,30",
                "text": "-iUMOP"
            }, {
                "boundingBox": "126,15,108,37",
                "text": "ap!Sdn"
            }, {
                "boundingBox": "249,21,59,32",
                "text": "pue"
            }, {
                "boundingBox": "322,22,175,34",
                "text": "sp]ewpeq"
            }]
        }, {
            "boundingBox": "9,73,493,33",
            "words": [{
                "boundingBox": "9,75,71,31",
                "text": "l.poq"
            }, {
                "boundingBox": "94,75,27,30",
                "text": "s!"
            }, {
                "boundingBox": "346,73,156,26",
                "text": "awosame"
            }]
        }, {
            "boundingBox": "4,124,503,37",
            "words": [{
                "boundingBox": "4,130,38,24",
                "text": "ue"
            }, {
                "boundingBox": "149,129,67,32",
                "text": "puv"
            }, {
                "boundingBox": "238,130,131,31",
                "text": "ipaF!Ma"
            }, {
                "boundingBox": "381,130,52,24",
                "text": "ae"
            }, {
                "boundingBox": "448,124,59,30",
                "text": "noÁ"
            }]
        }, {
            "boundingBox": "5,177,498,40",
            "words": [{
                "boundingBox": "5,178,76,31",
                "text": "Ásea"
            }, {
                "boundingBox": "94,185,70,31",
                "text": "qa!M"
            }, {
                "boundingBox": "248,185,73,31",
                "text": "pea]"
            }, {
                "boundingBox": "335,186,59,23",
                "text": "ueo"
            }, {
                "boundingBox": "407,177,59,32",
                "text": "noÁ"
            }, {
                "boundingBox": "476,185,27,32",
                "text": "JI"
            }]
        }]
    }]
}
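
In the two responses above, the "orientation" field is the part that answers the original question: it is "Down" when detection is on and "NotDetected" when it is off. As a minimal sketch, that field could be turned into a rotation for your own copy of the image as follows. The class and method names are hypothetical, and the degrees chosen for "Left" and "Right" assume the value describes the direction the top of the text is facing, which should be checked against the API documentation.

```csharp
using System;
using System.Text.Json;

public static class OcrOrientation
{
    // Returns how many degrees clockwise the image should be rotated to become
    // upright, or null when the response carries no usable orientation.
    public static int? ClockwiseDegreesToUpright(string ocrResponseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(ocrResponseJson))
        {
            JsonElement value;
            if (!doc.RootElement.TryGetProperty("orientation", out value) ||
                value.ValueKind != JsonValueKind.String)
            {
                return null;
            }

            switch (value.GetString())
            {
                case "Up":    return 0;
                case "Down":  return 180;
                case "Left":  return 90;   // assumption: top of the text points left
                case "Right": return 270;  // assumption: top of the text points right
                default:      return null; // e.g. "NotDetected"
            }
        }
    }
}
```

A result of 180 for the first response would mean the image is upside down, while null for the second means nothing can be concluded and the image should be left as it is.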

Here you can view the properties and run tests using your Ocp-Apim-Subscription-Key.

For my test I used the example available here.

I also ran the test with your image. In my opinion it worked as expected, but the way the text is laid out does not help. (I did not include the whole JSON because it exceeded the character limit for an answer.)

{
"language": "pt",
"textAngle": 0.0,
"orientation": "Down",
"regions": [{
    "boundingBox": "1113,32,57,20",
    "lines": [{
        "boundingBox": "1113,32,57,20",
        "words": [{
            "boundingBox": "1113,32,57,20",
            "text": "-Dep-"
        }]
    }]
}, {
    "boundingBox": "198,59,228,18",
    "lines": [{
        "boundingBox": "198,59,228,18",
        "words": [{
            "boundingBox": "198,61,47,16",
            "text": "Nome"
        }, {
            "boundingBox": "257,60,24,17",
            "text": "do"
        }, {
            "boundingBox": "296,59,130,18",
            "text": "Funcionario"
        }]
    }]
}, {
    "boundingBox": "572,58,129,22",
    "lines": [{
        "boundingBox": "572,58,129,22",
        "words": [{
            "boundingBox": "572,58,129,22",
            "text": "Cargo-Nivel"
        }]
    }]
}, {
    "boundingBox": "777,33,319,94",
    "lines": [{
        "boundingBox": "1017,33,58,20",
        "words": [{
            "boundingBox": "1017,33,58,20",
            "text": "Carga"
        }]
    }, {
        "boundingBox": "1005,57,82,17",
        "words": [{
            "boundingBox": "1005,57,82,17",
            "text": "Horaria"
        }]
    }, {
        "boundingBox": "777,107,319,20",
        "words": [{
            "boundingBox": "777,108,34,17",
            "text": "054"
        }, {
            "boundingBox": "825,107,190,18",
            "text": "GLES-L.4.620-FMS"
        }, {
            "boundingBox": "1030,107,66,20",
            "text": "111,11"
        }]
    }]
}, {
    "boundingBox": "1363,29,106,47",
    "lines": [{
        "boundingBox": "1364,29,105,18",
        "words": [{
            "boundingBox": "1364,29,57,18",
            "text": "Total"
        }, {
            "boundingBox": "1435,29,34,17",
            "text": "das"
        }]
    }, {
        "boundingBox": "1363,56,106,20",
        "words": [{
            "boundingBox": "1363,56,106,20",
            "text": "Vantagens"
        }]
    }]
}, {
    "boundingBox": "66,60,1430,1191",
    "lines": [{
        "boundingBox": "66,60,106,18",
        "words": [{
            "boundingBox": "66,60,106,18",
            "text": "Matricula"
        }]
    }, {
        "boundingBox": "1401,180,70,21",
        "words": [{
            "boundingBox": "1401,180,70,21",
            "text": "302,36"
        }]
    }]
}

  • I’m following this approach too. For now it’s working (even though the documentation mentions only 30°) and it reads what I need correctly. Thank you for your contribution, our discussion was a great help!

  • @Thiagoaraújo, Tmj :D

1

If the image contains EXIF metadata, you can identify its orientation and rotate it if necessary.

EXIF (Exchangeable Image File Format) is a standard followed by digital camera manufacturers for recording technical information about how an image was captured, in JPG or TIFF files.

Loading an image that contains EXIF data from a URL:

using System.Drawing;
using System.IO;
using System.Net;

WebClient wc = new WebClient();
byte[] bytes = wc.DownloadData("https://github.com/recurser/exif-orientation-examples/blob/master/Portrait_3.jpg?raw=true");
// The stream must stay open for as long as the Image is in use.
MemoryStream ms = new MemoryStream(bytes);
Image img = Image.FromStream(ms);

Reading the orientation and rotating:

// Contains() below requires: using System.Linq;
int idOrientacao = 0x0112; // id of the property that holds the orientation (274 in decimal)

// Check whether the image has the orientation property; images without EXIF do not.
if (img.PropertyIdList.Contains(idOrientacao)) {
    // Value that represents the image orientation (see the table linked below).
    int valorOrientacao = (int) img.GetPropertyItem(idOrientacao).Value[0];
    switch (valorOrientacao) {
        case 1:
            // No rotation needed
            break;
        case 2:
            img.RotateFlip(RotateFlipType.RotateNoneFlipX);
            break;
        case 3:
            img.RotateFlip(RotateFlipType.Rotate180FlipNone);
            break;
        case 4:
            img.RotateFlip(RotateFlipType.Rotate180FlipX);
            break;
        case 5:
            img.RotateFlip(RotateFlipType.Rotate90FlipX);
            break;
        case 6:
            img.RotateFlip(RotateFlipType.Rotate90FlipNone);
            break;
        case 7:
            img.RotateFlip(RotateFlipType.Rotate270FlipX);
            break;
        case 8:
            img.RotateFlip(RotateFlipType.Rotate270FlipNone);
            break;
    }
    // Remove the orientation property, since the pixels have now been corrected.
    img.RemovePropertyItem(idOrientacao);
}

Possible values for the variable valorOrientacao: documentation here.

Rotation options: documentation here.
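
If you only need to know whether a correction is required before deciding to rotate, the check can be separated from the rotation itself. The following is a minimal sketch with hypothetical names (`ExifOrientation`, `Decode`, `NeedsCorrection`); it assumes the raw property bytes hold a 16-bit value in the machine's byte order.

```csharp
using System;

public static class ExifOrientation
{
    // Decodes the raw bytes of the orientation property (id 0x0112).
    // Returns 1 (upright) when the bytes are missing.
    public static int Decode(byte[] raw)
    {
        if (raw == null || raw.Length == 0) return 1;
        // Assumption: a 16-bit value in the machine's byte order.
        // Falls back to the first byte when only one byte is present.
        return raw.Length >= 2 ? BitConverter.ToUInt16(raw, 0) : raw[0];
    }

    // Values 2 through 8 describe a rotated and/or mirrored image.
    public static bool NeedsCorrection(int orientation)
    {
        return orientation >= 2 && orientation <= 8;
    }
}
```

With the code above it could be called as `ExifOrientation.NeedsCorrection(ExifOrientation.Decode(img.GetPropertyItem(idOrientacao).Value))`, after first confirming that the property exists.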

  • The images I read are .png or .jpg. If I convert them so that they carry EXIF data, will they have these properties?

  • They will have the property, but the conversion does not identify the actual rotation, so the value will come back as if the image were not rotated.
