The latest OCR technologies offer automatic rotation settings. The Computer Vision API exposes this setting through the detectOrientation property. According to the tests I ran, this property defaults to true, i.e. when analyzing the image the API already rotates it to capture the information.
Setting the parameters of the request:
var requestParameters = "language=pt&detectOrientation=true";
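For reference, here is a minimal sketch of the full call, based on the Computer Vision quickstart sample. The endpoint region and the subscription key are placeholders; replace them with your own values:

using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class OcrSample
{
    // Placeholders: use your own key and the endpoint of your region.
    const string subscriptionKey = "<your-subscription-key>";
    const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr";

    static async Task<string> ReadTextAsync(string imageFilePath)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

            // The parameters discussed above.
            var requestParameters = "language=pt&detectOrientation=true";
            var uri = uriBase + "?" + requestParameters;

            byte[] imageBytes = File.ReadAllBytes(imageFilePath);
            using (var content = new ByteArrayContent(imageBytes))
            {
                content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                HttpResponseMessage response = await client.PostAsync(uri, content);
                return await response.Content.ReadAsStringAsync();
            }
        }
    }
}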
Using the image below:
Result with detectOrientation equal to true:
{
"language": "pt",
"textAngle": 0.0,
"orientation": "Down",
"regions": [{
"boundingBox": "5,14,503,202",
"lines": [{
"boundingBox": "9,14,498,40",
"words": [{
"boundingBox": "9,14,27,32",
"text": "If"
}, {
"boundingBox": "46,22,59,32",
"text": "you"
}, {
"boundingBox": "118,22,59,23",
"text": "can"
}, {
"boundingBox": "191,15,73,31",
"text": "read"
}, {
"boundingBox": "276,15,59,31",
"text": "this"
}, {
"boundingBox": "348,15,70,31",
"text": "with"
}, {
"boundingBox": "431,22,76,31",
"text": "easy"
}]
}, {
"boundingBox": "5,70,503,37",
"words": [{
"boundingBox": "5,77,59,30",
"text": "you"
}, {
"boundingBox": "79,77,52,24",
"text": "are"
}, {
"boundingBox": "143,70,131,31",
"text": "twisted!"
}, {
"boundingBox": "296,70,67,32",
"text": "And"
}, {
"boundingBox": "378,70,79,32",
"text": "have"
}, {
"boundingBox": "470,77,38,24",
"text": "an"
}]
}, {
"boundingBox": "10,124,493,34",
"words": [{
"boundingBox": "10,132,156,26",
"text": "awesome"
}, {
"boundingBox": "179,125,105,33",
"text": "talent!"
}, {
"boundingBox": "310,124,71,32",
"text": "This"
}, {
"boundingBox": "391,126,27,30",
"text": "is"
}, {
"boundingBox": "432,125,71,31",
"text": "both"
}]
}, {
"boundingBox": "15,175,485,41",
"words": [{
"boundingBox": "15,175,175,34",
"text": "backwards"
}, {
"boundingBox": "204,178,59,32",
"text": "and"
}, {
"boundingBox": "278,179,108,37",
"text": "upside"
}, {
"boundingBox": "399,179,101,30",
"text": "down!"
}]
}]
}]
}
Now the same test, setting the property's value to false:
var requestParameters = "language=pt&detectOrientation=false";
Result with detectOrientation equal to false:
{
"language": "pt",
"textAngle": 0.0,
"orientation": "NotDetected",
"regions": [{
"boundingBox": "4,15,503,202",
"lines": [{
"boundingBox": "12,15,485,41",
"words": [{
"boundingBox": "12,22,101,30",
"text": "-iUMOP"
}, {
"boundingBox": "126,15,108,37",
"text": "ap!Sdn"
}, {
"boundingBox": "249,21,59,32",
"text": "pue"
}, {
"boundingBox": "322,22,175,34",
"text": "sp]ewpeq"
}]
}, {
"boundingBox": "9,73,493,33",
"words": [{
"boundingBox": "9,75,71,31",
"text": "l.poq"
}, {
"boundingBox": "94,75,27,30",
"text": "s!"
}, {
"boundingBox": "346,73,156,26",
"text": "awosame"
}]
}, {
"boundingBox": "4,124,503,37",
"words": [{
"boundingBox": "4,130,38,24",
"text": "ue"
}, {
"boundingBox": "149,129,67,32",
"text": "puv"
}, {
"boundingBox": "238,130,131,31",
"text": "ipaF!Ma"
}, {
"boundingBox": "381,130,52,24",
"text": "ae"
}, {
"boundingBox": "448,124,59,30",
"text": "noÁ"
}]
}, {
"boundingBox": "5,177,498,40",
"words": [{
"boundingBox": "5,178,76,31",
"text": "Ásea"
}, {
"boundingBox": "94,185,70,31",
"text": "qa!M"
}, {
"boundingBox": "248,185,73,31",
"text": "pea]"
}, {
"boundingBox": "335,186,59,23",
"text": "ueo"
}, {
"boundingBox": "407,177,59,32",
"text": "noÁ"
}, {
"boundingBox": "476,185,27,32",
"text": "JI"
}]
}]
}]
}
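If you also need to fix the original file, note that the response itself tells you the detected orientation. Below is a minimal sketch of using that field to rotate the image; it assumes Newtonsoft.Json and System.Drawing are available, and the orientation-to-rotation mapping should be verified against the documentation:

using System.Drawing;
using Newtonsoft.Json.Linq;

static void FixOrientation(string jsonResponse, Bitmap image)
{
    // "orientation" indicates where the top of the recognized text is facing.
    var orientation = (string)JObject.Parse(jsonResponse)["orientation"];

    switch (orientation)
    {
        case "Down":  image.RotateFlip(RotateFlipType.Rotate180FlipNone); break;
        case "Left":  image.RotateFlip(RotateFlipType.Rotate90FlipNone);  break;
        case "Right": image.RotateFlip(RotateFlipType.Rotate270FlipNone); break;
        default: break; // "Up" or "NotDetected": nothing to do.
    }
}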
Here you can view the properties and perform tests using the Ocp-Apim-Subscription-Key. For my test, I used the example available here.
I also ran the test with your image; in my opinion it worked as expected, but the layout of the texts does not help. (I didn't include the whole JSON because it exceeded the character limit of a reply.)
{
"language": "pt",
"textAngle": 0.0,
"orientation": "Down",
"regions": [{
"boundingBox": "1113,32,57,20",
"lines": [{
"boundingBox": "1113,32,57,20",
"words": [{
"boundingBox": "1113,32,57,20",
"text": "-Dep-"
}]
}]
}, {
"boundingBox": "198,59,228,18",
"lines": [{
"boundingBox": "198,59,228,18",
"words": [{
"boundingBox": "198,61,47,16",
"text": "Nome"
}, {
"boundingBox": "257,60,24,17",
"text": "do"
}, {
"boundingBox": "296,59,130,18",
"text": "Funcionario"
}]
}]
}, {
"boundingBox": "572,58,129,22",
"lines": [{
"boundingBox": "572,58,129,22",
"words": [{
"boundingBox": "572,58,129,22",
"text": "Cargo-Nivel"
}]
}]
}, {
"boundingBox": "777,33,319,94",
"lines": [{
"boundingBox": "1017,33,58,20",
"words": [{
"boundingBox": "1017,33,58,20",
"text": "Carga"
}]
}, {
"boundingBox": "1005,57,82,17",
"words": [{
"boundingBox": "1005,57,82,17",
"text": "Horaria"
}]
}, {
"boundingBox": "777,107,319,20",
"words": [{
"boundingBox": "777,108,34,17",
"text": "054"
}, {
"boundingBox": "825,107,190,18",
"text": "GLES-L.4.620-FMS"
}, {
"boundingBox": "1030,107,66,20",
"text": "111,11"
}]
}]
}, {
"boundingBox": "1363,29,106,47",
"lines": [{
"boundingBox": "1364,29,105,18",
"words": [{
"boundingBox": "1364,29,57,18",
"text": "Total"
}, {
"boundingBox": "1435,29,34,17",
"text": "das"
}]
}, {
"boundingBox": "1363,56,106,20",
"words": [{
"boundingBox": "1363,56,106,20",
"text": "Vantagens"
}]
}]
}, {
"boundingBox": "66,60,1430,1191",
"lines": [{
"boundingBox": "66,60,106,18",
"words": [{
"boundingBox": "66,60,106,18",
"text": "Matricula"
}]
}, {
"boundingBox": "1401,180,70,21",
"words": [{
"boundingBox": "1401,180,70,21",
"text": "302,36"
}]
}]
}
Do these documents follow some standard?
– João Paulo Massa
@Joãopaulomassa they have no standard pattern; the OCR reads any image.
– Thiago Araújo
Does it take a long time to process? What if, when you read it and get garbage, you rotate the image and read again... until you find the right orientation? I don't understand OCR, it's just a suggestion.
– Rovann Linhalis
The reading doesn't take long; it reads the text even when the image is flipped... But then the text comes out as something like SOMAV (instead of VAMOS)... @Rovannlinhalis I also thought about re-reading the text until I got a better hit, but then the processing takes longer, and consider that I may read more than 1000 images every 10 minutes...
– Thiago Araújo
Are the documents generated by you? You could try reading some excerpt of the document and checking whether it is something "valid", such as a date.
– CypherPotato
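For illustration, a rough sketch of that suggestion; the date pattern below is only an assumed example:

using System.Text.RegularExpressions;

static bool LooksValid(string recognizedText)
{
    // Assumed heuristic: many such documents carry a date like 31/12/2017.
    // If no plausible excerpt is found, rotate the image and read it again.
    return Regex.IsMatch(recognizedText, @"\b\d{2}/\d{2}/\d{4}\b");
}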
@Cypherpotato the images come from ordinary users... To check, I would need a pattern, but there is no standard for the images that may be sent to be read... They may be all sorts of documents (payslips, laws, forms, reports, etc.).
– Thiago Araújo
Is there an error when reading these documents if they are upside down? If so, you can implement a try...catch that retries until it works properly; but if that is not the issue, you will have to implement some artificial intelligence to understand whether the images are rotated or not.
– CypherPotato
No error occurs... It reads the file regardless. I found some things on the English Stack Overflow about changing the image's tone and so on, but I thought the solution to check the image could be simpler.
– Thiago Araújo
If you know any English, take a look here.
– CypherPotato
Which OCR tool are you using? See if this post helps you: https://stackoverflow.com/questions/33809499/android-java-detect-text-orientation-and-rotate-image-for-ocr. It seems some OCR frameworks already have this feature.
– João Paulo Massa
@Joãopaulomassa I'm using Microsoft Vision. I was using the ABBYY solution before, and MS is more efficient at reading, but I did not find anything in it that automatically rotates the image.
– Thiago Araújo
@Thiagoaraújo, did you manage to solve the problem? Which API are you using?
– João Paulo Massa
Hello @Joãopaulomassa, I'm using Microsoft Vision and it reads images with up to 30° of rotation. I haven't solved the issue of automatically detecting and flipping the image (180°) yet.
– Thiago Araújo
@Thiagoaraújo, I did a test with the Computer Vision API. Is this the one you are using?
– João Paulo Massa
That's the one, @Joãopaulomassa.
– Thiago Araújo
@Thiagoaraújo, you can set parameters in the request, one of which is detectOrientation=true. The API itself identifies the position of the image and makes the necessary adjustments to capture the information. If you haven't used this property as a parameter, I can create an example.
– João Paulo Massa
Thanks @Joãopaulomassa, but their API only corrects up to 30° of rotation (image correction angle), as their documentation states: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#Recognizetext
– Thiago Araújo