Looking up a CPF on the Receita Federal website

Viewed 3,059 times

3

I would like to look up a person's name from their CPF on the Receita Federal website. There are Android apps where you just enter the CPF and they return the name and the registration status. But the site has a CAPTCHA.

Has anyone done something similar?

In PHP or jQuery.

  • 2
  • @Ciganomorrisonmendez, I haven't, but I have a friend who has already done it. What he does is download the page content into a string stream and handle the logic of filling in the fields. The CAPTCHA was circumvented by taking the audio and using a feature to work out each letter, which unfortunately I can't describe for ethical reasons, but that is the path.

  • If the problem is the CAPTCHA itself, there are already some questions around here that deal with the same subject.

  • 1

    There is a solution through an API: https://gist.github.com/Pompeu/ce58d61cde1e51a1da164404d667d458

2 answers

6


I hesitated over answering this question, but sharing knowledge is never too much; whoever reads this decides whether to implement it and weighs the possible illegalities. I will only address how it would technically be possible to get through systems like this.

First of all, you really will need to build a robot that automates the methods for getting through the CAPTCHA. Nowadays the vast majority of the options are image and audio.

Image

[image: sample of the image CAPTCHA]

Start with a visual analysis and try to find the existing patterns just by observing (refresh the CAPTCHA several times). You will conclude that the noise is the main problem, so what can you do to reduce it? Unfortunately, do not expect to find anything ready-made; you will have to develop the algorithms yourself.

First, make sure the image really is pure black and white. Then look for connected black pixels and remove every connected group smaller than a size X; this clears all the dots in the image. The next step is to remove the lines that cut through, and sometimes cross, the letters. One solution is to search the image matrix for black pixels lying in a straight line and clear those whose straight run is longer than a value X that you define.

If you do this successfully, the image will be clean and you can submit it to an OCR engine. You will need to know a little about matrix calculations and image signal processing.
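The dot-removal step described above (dropping connected groups of black pixels smaller than a size X) could be sketched roughly as follows. This is only an illustration on a tiny hand-made matrix, not the code used for the real image: the function name `remove_small_components`, the 1 = black / 0 = white encoding, and the choice of 4-connectivity are all assumptions made for the example. It does not cover the line-removal step.

```python
from collections import deque


def remove_small_components(image, min_size):
    """Return a copy of a binary image (1 = black, 0 = white) in which every
    4-connected group of black pixels smaller than min_size is erased."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the threshold are treated as speckle noise.
            if len(component) < min_size:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result


sample = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]
cleaned = remove_small_components(sample, min_size=3)
```

With `min_size=3` the two isolated pixels are cleared while the 2x2 block survives. The right threshold depends on the stroke width of the letters and has to be found by experiment.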

Audio

[image: the CAPTCHA audio]

That audio says N H J K H j

There is not much alternative: you will need to analyze and process the audio too. When you listen to it, there is a lot of noise and hiss. The first problem is a loud noise that is most noticeable between the spoken letters; this is an attempt to confuse algorithms that try to segment (crop) each letter. Running a Fourier analysis on the first 2048 frames shows at which frequencies this strange noise occurs and what its average amplitude is:

[image: Fourier spectrum of the first 2048 frames]

Okay, it really is noise: frequencies oscillate across the whole spectrum, with a peak of about 78 (linear magnitude), but on average they stay below a magnitude of 10.
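For readers who have not run a Fourier analysis on a block of frames before, here is a minimal sketch of the idea using only the standard library. It is a naive O(n²) DFT applied to a synthetic tone, not the tool used to produce the plots in this answer; the function names, the sample rate, and the normalization (dividing by the number of frames) are assumptions, so the magnitudes will not be on the same scale as the values quoted above.

```python
import cmath
import math


def magnitude_spectrum(frames, sample_rate):
    """Naive DFT of a block of frames.

    Returns a list of (frequency_hz, magnitude) pairs for the bins from
    0 Hz up to the Nyquist frequency, with magnitude divided by len(frames).
    """
    n = len(frames)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            frames[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append((k * sample_rate / n, abs(total) / n))
    return spectrum


def dominant_frequency(frames, sample_rate):
    """Frequency of the strongest bin, ignoring bin 0 (the DC offset)."""
    bins = magnitude_spectrum(frames, sample_rate)[1:]
    return max(bins, key=lambda pair: pair[1])[0]


# Synthetic example: a pure 100 Hz tone sampled at 800 Hz (assumed values).
sample_rate = 800
frames = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(64)]
peak_hz = dominant_frequency(frames, sample_rate)
```

For a real 2048-frame block an FFT library would be the practical choice. The naive loop is used here only to keep the example self-contained.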

OK, but what if you run another Fourier analysis on the segment containing the first letter, N?

[image: Fourier spectrum of the first letter, N]

What you can do here is apply a high-pass filter that lets through only frequencies above 50 Hz (50 Hz is the minimum the human voice emits); this alone already cancels any noise in the low frequencies. Notice that for the letter N the frequencies concentrate between 120 and 600 Hz, and if you analyze every letter in this CAPTCHA's audio you will not find frequencies above 1000 Hz. We can therefore build an equalizer that attenuates or eliminates the noise above this band, which gives you cleaner audio.

Up to this point you have only cleaned the audio. You will still need to extract and clean the audio for each letter by hand (A.wav, B.wav, and so on). Once that is done, automate the process: write an algorithm that segments the audio automatically by cropping where each letter starts and ends, process each segment the same way as the per-letter files, and compute the cross-correlation against them. The result of this correlation tells you which letter it is.
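The final comparison step, cross-correlating a segment against stored reference signals, is a generic template-matching technique. A minimal sketch on synthetic sine waves might look like this; the normalized form of the correlation, the function names, and the `templates` dictionary are assumptions for illustration and do not reproduce the author's implementation. Filtering and segmentation are not covered.

```python
import math


def normalized_cross_correlation(signal, template):
    """Slide template over signal and return the best normalized correlation.

    The score lies in [-1, 1]. It returns -1.0 if the signal is shorter than
    the template or if every window is constant.
    """
    m = len(template)
    t_mean = sum(template) / m
    t_centered = [v - t_mean for v in template]
    t_norm = math.sqrt(sum(v * v for v in t_centered))
    best = -1.0
    for offset in range(len(signal) - m + 1):
        window = signal[offset:offset + m]
        w_mean = sum(window) / m
        w_centered = [v - w_mean for v in window]
        w_norm = math.sqrt(sum(v * v for v in w_centered))
        if t_norm == 0 or w_norm == 0:
            continue  # a flat window carries no shape information
        dot = sum(a * b for a, b in zip(w_centered, t_centered))
        best = max(best, dot / (w_norm * t_norm))
    return best


def classify(segment, templates):
    """Return the label of the template that correlates best with segment."""
    return max(
        templates,
        key=lambda label: normalized_cross_correlation(segment, templates[label]),
    )


# Synthetic stand-ins for per-letter reference recordings (assumed data).
templates = {
    "A": [math.sin(2 * math.pi * 3 * t / 40) for t in range(40)],
    "B": [math.sin(2 * math.pi * 7 * t / 40) for t in range(40)],
}
# A quieter copy of "A" padded with silence on both sides.
segment = [0.0] * 10 + [0.5 * v for v in templates["A"]] + [0.0] * 10
label = classify(segment, templates)
```

Normalizing by the energy of each window makes the score independent of volume, which is why the half-amplitude copy still matches its template.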

In summary, there is no easy way: it is a lot of work, and you still run the risk of the site changing its CAPTCHA system (algorithm).

2

I recently did this in C#, but I did not bypass the CAPTCHA: I fetched the image and asked the user to type it, just as on the website, but inside my application.

The other day I saw a solution that "broke" the CAPTCHA, but I believe it is somewhat illegal, so I won't even point to it (it was also in C#). The author had reverse engineered an Android APK and obtained the application key used in the requests to run the search.

I believe there are ready-made solutions on the internet, but if you want to implement your own, it is as @Ciganomorrisonmendez said. You will do so-called "web scraping": reading information from a web page and sending requests directly to its server, simulating what the browser does.

For this you will probably use cURL. I also advise using Fiddler to inspect the requests and understand how the information flows.
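As a rough illustration of "simulating the browser" when the user still types the CAPTCHA themselves, the sketch below keeps a cookie jar so the session that served the CAPTCHA image is the same one that submits the form. It is written in Python rather than PHP or C#, and the URL and the field names (`cpf`, `captcha`) are hypothetical placeholders; the real ones would have to be read from the requests captured with a tool like Fiddler. Nothing is sent over the network here.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_session():
    """Create an opener that stores cookies, so the CAPTCHA image request
    and the later form submission share one server-side session."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_form_request(url, fields):
    """Build (but do not send) a POST request with form-encoded fields."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


opener, jar = build_session()
# Hypothetical endpoint and field names, for illustration only.
request = build_form_request(
    "https://example.invalid/lookup",
    {"cpf": "00000000000", "captcha": "text typed by the user"},
)
# Sending it would be: response = opener.open(request)
```

In the flow described in this answer, the application would first fetch the CAPTCHA image through the same `opener`, show it to the user, and only then submit the form with what the user typed.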

Cheers.

  • Hello Joel, there really are these two ways, but the CAPTCHA is a bit inconvenient for the users at the company I am providing the service to. I don't know about the legality, but I'm investing in reverse engineering the Android APK. As soon as I finish, I'll post here.
