Image processing? A question about the topic and how it is used

1

Hi everyone, how are you? I am not sure whether "Image Processing" is even the right name for this topic. You may have noticed that some apps offer something along these lines.

For example, there are word-translation apps: when you point the phone's camera at some text, it is translated into the language you choose.

My question is: what is this "magic" called, and what language is used for it? Is there a library that helps with development? I even wondered whether there is an AI that does this processing, detecting objects, text, and so on. Can you shed some light on this? Thank you.

  • I don’t know if this is within the scope of the site, but I think you want cognitive computing https://en.wikipedia.org/wiki/Cognitive_computing.

  • Thanks, I’ll take a look.

  • First, you extract the text from the image; then you try to identify the language of that text; and finally you translate it. Taken step by step, it becomes much simpler.

  • When the expert moderator says he is unsure, it is because the question is good. + 1

  • @Cool I think what he wants is called augmented reality

  • @ramaral OpenCV can be used to develop only one part of this

2 answers

1

It is a new field that has been gaining ground, called Augmented Reality:

Azuma defines augmented reality as a system that:

  • combines virtual elements with the real environment;
  • is interactive and has real-time processing;
  • is conceived in three dimensions.

With processing power and memory on mobile devices growing so quickly, interaction between the real and virtual environments through the camera and audio is now a reality. See Pokemon Go: it is a game that uses the processing power of your smartphone to build real-time combinations of real elements (images from your camera) with virtual elements ("pets" that seem to be in front of you when you look at the images from the camera).

Augmented Reality is a concept; any language can be used to implement it. In the case of Pokemon Go and other apps for Android phones, most are written in Java, combined with C/C++ code through JNI (Java Native Interface).

The app you mentioned, which translates words and phrases in real time when you point the phone at them, involves several steps. It looks like magic, but it is not: if you dedicate yourself, you can build a prototype of this kind of algorithm on your smartphone. Write down the recipe:

  • You need to learn about image processing

  • Linear algebra (basic calculations on the x, y, z dimensions)

  • Learn C to write the OCR (Optical Character Recognition) code. You will need to segment each word/phrase/letter, and you will only achieve this by developing an OCR whose job is to extract the letters from an image/camera frame and return them as text. This is a complex part of the code and is usually written in C for performance. You can use a library called OpenCV, which has many ready-made functions for image processing. Want to know in detail how to build an OCR? The steps are in this reply of mine or here; if you need something more practical, I wrote some code showing the concept of how to separate each letter using Python here

  • Learn Java (Android)

  • Learn how to integrate C code with Java using JNI

  • Train on a large database covering the whole alphabet in different font types (you want your algorithm to be robust, right? You want it to read and identify as many fonts as possible: Arial, italic, Times New Roman, etc.)

  • A dictionary to translate each word once it has been converted to text (i.e., you will basically need a word/sentence translator)

  • After capturing and translating the text, you need to replace the original text with the new text. You know the coordinates on the plane your camera is capturing, since you must have kept the x, y, z positions when you segmented the word/phrase with the OCR. Now you just have to overwrite the original phrase with the translated phrase... Done, lol
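To make the segmentation step of the recipe a little more concrete, here is a minimal sketch in plain Python. It is not the code from the linked replies: it assumes the image has already been binarized (1 = ink, 0 = background) and that letters are separated by at least one empty column, which is a strong simplification. The function name `segment_letters` and the toy `IMAGE` are made up for illustration. The point is only that each letter comes back with its coordinates, which the last step of the recipe needs.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def segment_letters(image: List[List[int]]) -> List[Box]:
    """Split a binary image (1 = ink, 0 = background) into letter boxes.

    Assumption: letters are separated by at least one column without ink.
    """
    if not image:
        return []
    height, width = len(image), len(image[0])

    # Vertical projection: number of ink pixels in each column.
    column_ink = [sum(image[y][x] for y in range(height)) for x in range(width)]

    boxes: List[Box] = []
    start: Optional[int] = None
    # Iterate one column past the edge so a letter touching the border is closed.
    for x in range(width + 1):
        has_ink = x < width and column_ink[x] > 0
        if has_ink and start is None:
            start = x  # a new letter begins here
        elif not has_ink and start is not None:
            # The letter spans columns [start, x); tighten the box vertically.
            rows = [y for y in range(height) if any(image[y][start:x])]
            top, bottom = rows[0], rows[-1]
            boxes.append((start, top, x - start, bottom - top + 1))
            start = None
    return boxes


# Toy 4x8 "image" containing three blobs of ink.
IMAGE = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

boxes = segment_letters(IMAGE)
```

Real text has touching letters, noise and skew, so a production OCR would need something more robust (OpenCV's contour or connected-component functions are the usual starting point), but the idea of keeping a box per letter is the same.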

Of course it's complex, haha, but the steps are there...
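The last two steps of the recipe (dictionary lookup, then overwriting at the stored coordinates) could be sketched like this. Everything here is an assumption made for illustration: the "frame" is a list of strings standing in for camera pixels, only x and y are kept, and `Word`, `DICTIONARY`, `translate` and `overwrite` are hypothetical names, not part of any library.

```python
from typing import Dict, List, NamedTuple


class Word(NamedTuple):
    text: str  # text returned by the OCR step
    x: int     # column where the word starts in the frame
    y: int     # row of the frame that contains the word


# Hypothetical toy dictionary (Portuguese -> English), for illustration only.
DICTIONARY: Dict[str, str] = {"aberto": "open", "fechado": "closed"}


def translate(word: Word, dictionary: Dict[str, str]) -> str:
    """Look the word up; unknown words are returned unchanged."""
    return dictionary.get(word.text.lower(), word.text)


def overwrite(frame: List[str], word: Word, replacement: str) -> List[str]:
    """Return a copy of the frame with `word` replaced at its stored position."""
    row = frame[word.y]
    # Cover the longer of the two texts so no piece of the original stays visible.
    span = max(len(word.text), len(replacement))
    patch = replacement.ljust(span)
    new_row = row[:word.x] + patch + row[word.x + span:]
    result = list(frame)  # shallow copy, the input frame is not modified
    result[word.y] = new_row
    return result


# A 3-row "camera frame" with one word found by the OCR at x=2, y=1.
FRAME = ["          ", "  ABERTO  ", "          "]
found = Word(text="ABERTO", x=2, y=1)

translated_frame = overwrite(FRAME, found, translate(found, DICTIONARY))
```

In a real app the overwrite would paint a background-colored rectangle over the word's box and draw the translated text on top of it, frame after frame, but the bookkeeping is the same: the position kept during segmentation tells you where to draw.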

0

Complementing ederwander's answer, I think what you are really looking for is something related to OCR, and one widely used tool is Tesseract.

Training will not always be a problem, because good trained models are already available online, including for pt-br. As for the language, rest assured: there are wrappers for several, such as C# and Python.
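As a rough illustration of what you get back from Tesseract, here is a minimal sketch that reads its TSV output in plain Python, so no wrapper is needed. It assumes the column layout the command-line tool writes when asked for TSV (for example `tesseract image.png out tsv`); the sample rows and the helper name `word_boxes` are made up for this example. Whether this is fast and accurate enough for a real-time overlay is a separate question.

```python
import csv
import io
from typing import List, Tuple

# Made-up sample rows in the column layout of Tesseract's TSV output.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "4\t1\t1\t1\t1\t0\t30\t40\t210\t28\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t30\t40\t95\t28\t96\tBom\n"
    "5\t1\t1\t1\t1\t2\t140\t40\t100\t28\t91\tdia\n"
)

WordBox = Tuple[str, int, int, int, int]  # (text, left, top, width, height)


def word_boxes(tsv_text: str, min_conf: float = 0.0) -> List[WordBox]:
    """Return the text and bounding box of each recognized word."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    boxes: List[WordBox] = []
    for row in reader:
        text = (row["text"] or "").strip()
        # Level 5 rows are words; levels 1-4 (page, block, paragraph, line)
        # describe layout only and carry no text.
        if row["level"] != "5" or not text:
            continue
        # Skip words whose confidence is below the requested threshold.
        if float(row["conf"]) < min_conf:
            continue
        boxes.append(
            (text, int(row["left"]), int(row["top"]),
             int(row["width"]), int(row["height"]))
        )
    return boxes


words = word_boxes(SAMPLE_TSV)
confident_words = word_boxes(SAMPLE_TSV, min_conf=95)
```

Each tuple holds the recognized word together with the position and size of its box in the image, in pixels.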

  • A third-party OCR will not work: how will it know the exact coordinates of the text to be replaced by the real-time translation? Traditional OCRs only take an image and convert the text in it to text mode. He needs more than that: he needs to know where in the image the text is, so that the translated text can be drawn over it...
