The latest OCR technologies offer automatic rotation settings. The Computer Vision API exposes this setting through the detectOrientation property. According to the tests I ran, this property defaults to true, i.e. when analyzing the image the API already rotates it to capture the information.
Setting the parameters of the request:
var requestParameters = "language=pt&detectOrientation=true";
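For reference, here is a minimal sketch of the full call, based on the Computer Vision quickstart sample. The endpoint region and the subscription key are placeholders; replace them with your own values:

using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class OcrSample
{
    // Placeholders: use your own key and the endpoint of your region.
    const string subscriptionKey = "<your-subscription-key>";
    const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr";

    static async Task<string> ReadTextAsync(string imageFilePath)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

            // The parameters discussed above.
            var requestParameters = "language=pt&detectOrientation=true";
            var uri = uriBase + "?" + requestParameters;

            byte[] imageBytes = File.ReadAllBytes(imageFilePath);
            using (var content = new ByteArrayContent(imageBytes))
            {
                content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                HttpResponseMessage response = await client.PostAsync(uri, content);
                return await response.Content.ReadAsStringAsync();
            }
        }
    }
}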
Using the image below:
Result with detectOrientation equal to true:
{
"language": "pt",
"textAngle": 0.0,
"orientation": "Down",
"regions": [{
"boundingBox": "5,14,503,202",
"lines": [{
"boundingBox": "9,14,498,40",
"words": [{
"boundingBox": "9,14,27,32",
"text": "If"
}, {
"boundingBox": "46,22,59,32",
"text": "you"
}, {
"boundingBox": "118,22,59,23",
"text": "can"
}, {
"boundingBox": "191,15,73,31",
"text": "read"
}, {
"boundingBox": "276,15,59,31",
"text": "this"
}, {
"boundingBox": "348,15,70,31",
"text": "with"
}, {
"boundingBox": "431,22,76,31",
"text": "easy"
}]
}, {
"boundingBox": "5,70,503,37",
"words": [{
"boundingBox": "5,77,59,30",
"text": "you"
}, {
"boundingBox": "79,77,52,24",
"text": "are"
}, {
"boundingBox": "143,70,131,31",
"text": "twisted!"
}, {
"boundingBox": "296,70,67,32",
"text": "And"
}, {
"boundingBox": "378,70,79,32",
"text": "have"
}, {
"boundingBox": "470,77,38,24",
"text": "an"
}]
}, {
"boundingBox": "10,124,493,34",
"words": [{
"boundingBox": "10,132,156,26",
"text": "awesome"
}, {
"boundingBox": "179,125,105,33",
"text": "talent!"
}, {
"boundingBox": "310,124,71,32",
"text": "This"
}, {
"boundingBox": "391,126,27,30",
"text": "is"
}, {
"boundingBox": "432,125,71,31",
"text": "both"
}]
}, {
"boundingBox": "15,175,485,41",
"words": [{
"boundingBox": "15,175,175,34",
"text": "backwards"
}, {
"boundingBox": "204,178,59,32",
"text": "and"
}, {
"boundingBox": "278,179,108,37",
"text": "upside"
}, {
"boundingBox": "399,179,101,30",
"text": "down!"
}]
}]
}]
}
Now the same test, setting the property's value to false:
var requestParameters = "language=pt&detectOrientation=false";
Result with detectOrientation equal to false:
{
"language": "pt",
"textAngle": 0.0,
"orientation": "NotDetected",
"regions": [{
"boundingBox": "4,15,503,202",
"lines": [{
"boundingBox": "12,15,485,41",
"words": [{
"boundingBox": "12,22,101,30",
"text": "-iUMOP"
}, {
"boundingBox": "126,15,108,37",
"text": "ap!Sdn"
}, {
"boundingBox": "249,21,59,32",
"text": "pue"
}, {
"boundingBox": "322,22,175,34",
"text": "sp]ewpeq"
}]
}, {
"boundingBox": "9,73,493,33",
"words": [{
"boundingBox": "9,75,71,31",
"text": "l.poq"
}, {
"boundingBox": "94,75,27,30",
"text": "s!"
}, {
"boundingBox": "346,73,156,26",
"text": "awosame"
}]
}, {
"boundingBox": "4,124,503,37",
"words": [{
"boundingBox": "4,130,38,24",
"text": "ue"
}, {
"boundingBox": "149,129,67,32",
"text": "puv"
}, {
"boundingBox": "238,130,131,31",
"text": "ipaF!Ma"
}, {
"boundingBox": "381,130,52,24",
"text": "ae"
}, {
"boundingBox": "448,124,59,30",
"text": "noÁ"
}]
}, {
"boundingBox": "5,177,498,40",
"words": [{
"boundingBox": "5,178,76,31",
"text": "Ásea"
}, {
"boundingBox": "94,185,70,31",
"text": "qa!M"
}, {
"boundingBox": "248,185,73,31",
"text": "pea]"
}, {
"boundingBox": "335,186,59,23",
"text": "ueo"
}, {
"boundingBox": "407,177,59,32",
"text": "noÁ"
}, {
"boundingBox": "476,185,27,32",
"text": "JI"
}]
}]
}]
}
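If you also need to fix the original file, note that the response itself tells you the detected orientation. Below is a minimal sketch of using that field to rotate the image; it assumes Newtonsoft.Json and System.Drawing are available, and the orientation-to-rotation mapping should be verified against the documentation:

using System.Drawing;
using Newtonsoft.Json.Linq;

static void FixOrientation(string jsonResponse, Bitmap image)
{
    // "orientation" indicates where the top of the recognized text is facing.
    var orientation = (string)JObject.Parse(jsonResponse)["orientation"];

    switch (orientation)
    {
        case "Down":  image.RotateFlip(RotateFlipType.Rotate180FlipNone); break;
        case "Left":  image.RotateFlip(RotateFlipType.Rotate90FlipNone);  break;
        case "Right": image.RotateFlip(RotateFlipType.Rotate270FlipNone); break;
        default: break; // "Up" or "NotDetected": nothing to do.
    }
}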
Here you can view the properties and perform tests using the Ocp-Apim-Subscription-Key. For my test, I used the example available here.
I also ran the test with your image; in my opinion it worked as expected, but the layout of the texts does not help. (I didn't include the whole JSON because it exceeded the character limit of a reply.)
{
"language": "pt",
"textAngle": 0.0,
"orientation": "Down",
"regions": [{
"boundingBox": "1113,32,57,20",
"lines": [{
"boundingBox": "1113,32,57,20",
"words": [{
"boundingBox": "1113,32,57,20",
"text": "-Dep-"
}]
}]
}, {
"boundingBox": "198,59,228,18",
"lines": [{
"boundingBox": "198,59,228,18",
"words": [{
"boundingBox": "198,61,47,16",
"text": "Nome"
}, {
"boundingBox": "257,60,24,17",
"text": "do"
}, {
"boundingBox": "296,59,130,18",
"text": "Funcionario"
}]
}]
}, {
"boundingBox": "572,58,129,22",
"lines": [{
"boundingBox": "572,58,129,22",
"words": [{
"boundingBox": "572,58,129,22",
"text": "Cargo-Nivel"
}]
}]
}, {
"boundingBox": "777,33,319,94",
"lines": [{
"boundingBox": "1017,33,58,20",
"words": [{
"boundingBox": "1017,33,58,20",
"text": "Carga"
}]
}, {
"boundingBox": "1005,57,82,17",
"words": [{
"boundingBox": "1005,57,82,17",
"text": "Horaria"
}]
}, {
"boundingBox": "777,107,319,20",
"words": [{
"boundingBox": "777,108,34,17",
"text": "054"
}, {
"boundingBox": "825,107,190,18",
"text": "GLES-L.4.620-FMS"
}, {
"boundingBox": "1030,107,66,20",
"text": "111,11"
}]
}]
}, {
"boundingBox": "1363,29,106,47",
"lines": [{
"boundingBox": "1364,29,105,18",
"words": [{
"boundingBox": "1364,29,57,18",
"text": "Total"
}, {
"boundingBox": "1435,29,34,17",
"text": "das"
}]
}, {
"boundingBox": "1363,56,106,20",
"words": [{
"boundingBox": "1363,56,106,20",
"text": "Vantagens"
}]
}]
}, {
"boundingBox": "66,60,1430,1191",
"lines": [{
"boundingBox": "66,60,106,18",
"words": [{
"boundingBox": "66,60,106,18",
"text": "Matricula"
}]
}, {
"boundingBox": "1401,180,70,21",
"words": [{
"boundingBox": "1401,180,70,21",
"text": "302,36"
}]
}]
}
Do these documents follow some standard?
– João Paulo Massa
@Joãopaulomassa they have no standard pattern; the OCR reads any image.
– Thiago Araújo
Does it take a long time to process? What if, when you read it and get garbage, you rotate the image and read again... until you find the right orientation? I don't understand OCR, it's just a suggestion.
– Rovann Linhalis
The reading doesn't take long; it reads the text even when the image is flipped... But then the text comes out as something like SOMAV (instead of VAMOS)... @Rovannlinhalis I also thought about re-reading the text until I got a better hit, but then the processing takes longer, and consider that I may read more than 1000 images every 10 minutes...
– Thiago Araújo
Are the documents generated by you? You could try reading some excerpt of the document and checking whether it is something "valid", such as a date.
– CypherPotato
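For illustration, a rough sketch of that suggestion; the date pattern below is only an assumed example:

using System.Text.RegularExpressions;

static bool LooksValid(string recognizedText)
{
    // Assumed heuristic: many such documents carry a date like 31/12/2017.
    // If no plausible excerpt is found, rotate the image and read it again.
    return Regex.IsMatch(recognizedText, @"\b\d{2}/\d{2}/\d{4}\b");
}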
@Cypherpotato the images come from ordinary users... To check, I would need a pattern, but there is no standard for the images that may be sent to be read... They may be all sorts of documents (payslips, laws, forms, reports, etc.).
– Thiago Araújo
Is there an error when reading these documents if they are upside down? If so, you can implement a try...catch that retries until it works properly; but if that is not the issue, you will have to implement some artificial intelligence to understand whether the images are rotated or not.
– CypherPotato
No error occurs... It reads the file regardless. I found some things on the English Stack Overflow about changing the image's tone and so on, but I thought the solution to check the image could be simpler.
– Thiago Araújo
If you know any English, take a look here.
– CypherPotato
Which OCR tool are you using? See if this post helps you: https://stackoverflow.com/questions/33809499/android-java-detect-text-orientation-and-rotate-image-for-ocr. It seems some OCR frameworks already have this feature.
– João Paulo Massa
@Joãopaulomassa I'm using Microsoft Vision. I was using the ABBYY solution before, and MS is more efficient at reading, but I did not find anything in it that automatically rotates the image.
– Thiago Araújo
@Thiagoaraújo, did you manage to solve the problem? Which API are you using?
– João Paulo Massa
Hello @Joãopaulomassa, I'm using Microsoft Vision and it reads images with up to 30° of rotation. I haven't solved the issue of automatically detecting and flipping the image (180°) yet.
– Thiago Araújo
@Thiagoaraújo, I did a test with the Computer Vision API. Is this the one you are using?
– João Paulo Massa
That's the one, @Joãopaulomassa.
– Thiago Araújo
@Thiagoaraújo, you can set parameters in the request, one of which is detectOrientation=true. The API itself identifies the position of the image and makes the necessary adjustments to capture the information. If you haven't used this property as a parameter, I can create an example.
– João Paulo Massa
Thanks @Joãopaulomassa, but their API only corrects up to 30° of rotation (image correction angle), as their documentation states: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#Recognizetext
– Thiago Araújo