Crawler to log in to the Nota Fiscal Paulista website

What I have so far is this:

package br.com.crawler;

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.URL;

import javax.net.ssl.HttpsURLConnection;

public class Crawler {

    private final String USER_AGENT = "Mozilla/5.0";

    public static void main(String[] args) throws Exception {

        Crawler http = new Crawler();

        System.out.println("\nTest 1 - Send request via POST");
        http.sendPost();

    }

    // HTTP POST request
    private void sendPost() throws Exception {

        String url = "https://www.nfp.fazenda.sp.gov.br/login.aspx";
        URL obj = new URL(url);
        HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();

        // add request headers
        con.setRequestMethod("POST");
        con.setRequestProperty("User-Agent", USER_AGENT);
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");

        String urlParameters = "__EVENTVALIDATION=&"
            + "__EVENTARGUMENT=&"
            + "__VIEWSTATE=/wEPDwUKMTMwMTM2MTg2MA9kFgJmD2QWAgIBD2QWCgIDDxYCHgVjbGFzcwUYYmFycmFBY2Vzc2liaWxpZGFkZUxvZ2luFgQCAQ8WAh4HVmlzaWJsZWhkAgMPFgIfAWdkAgQPFgIfAWhkAgYPDxYCHgRUZXh0BRROb3RhIEZpc2NhbCBQYXVsaXN0YWRkAggPFgIfAWhkAgoPZBYCZg9kFgJmD2QWBAIJDw8WAh8BZ2RkAg8PZBYCAgUPZBYCAgEPZBYCAgEPDxYEHghUYWJJbmRleAENAB4JTWF4TGVuZ3RoAgRkZBgBBR5fX0NvbnRyb2xzUmVxdWlyZVBvc3RCYWNrS2V5X18WCAUtY3RsMDAkQ29udGV1ZG9QYWdpbmEkTG9naW4xJHJkQnRuQ29udHJpYnVpbnRlBTBjdGwwMCRDb250ZXVkb1BhZ2luYSRMb2dpbjEkcmRCdG5OYW9Db250cmlidWludGUFLWN0bDAwJENvbnRldWRvUGFnaW5hJExvZ2luMSRyZEJ0bkNvbnRhYmlsaXN0YQUrY3RsMDAkQ29udGV1ZG9QYWdpbmEkTG9naW4xJHJkQnRuRmF6ZW5kYXJpbwUnY3RsMDAkQ29udGV1ZG9QYWdpbmEkTG9naW4xJHJkQnRuUHJvY29uBTZjdGwwMCRDb250ZXVkb1BhZ2luYSRMb2dpbjEkcmRCdG5BZHZvZ2Fkb1JlcHJlc2VudGFudGUFL2N0bDAwJENvbnRldWRvUGFnaW5hJExvZ2luMSRpbWdCdG5BY2Vzc29DZXJ0Q1BGBTBjdGwwMCRDb250ZXVkb1BhZ2luYSRMb2dpbjEkaW1nQnRuQWNlc3NvQ2VydENOUEo=&"
            + "ctl00$ConteudoPagina$Login1$rblTipo=rdBtnNaoContribuinte&"
            + "ConteudoPagina$Login1$UserName="+user+"&"
            + "ctl00$ConteudoPagina$Login1$Password="+password;

        // Send post request
        con.setDoOutput(true);
        DataOutputStream wr = new DataOutputStream(con.getOutputStream());
        wr.writeBytes(urlParameters);
        wr.flush();
        wr.close();

        int responseCode = con.getResponseCode();
        System.out.println("Sending 'POST' request to URL : " + url);
        System.out.println("Parameters : " + urlParameters);
        System.out.println("Response Code: " + responseCode);

        BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()));
        String inputLine;
        StringBuffer response = new StringBuffer();

        while ((inputLine = in.readLine()) != null) {
            response.append(inputLine);
        }
        in.close();

        //print result
        System.out.println(response.toString());

    }

}

My problem is that I don't know which parameters to pass.

  • So the parameters you don't know are the ones that would go in String urlParameters = "param1=valor1&param2=valor2";?

  • Right, I put that only as an example, but it should contain something like login, password and a few others.

  • I'm passing these parameters: ctl00$ConteudoPagina$Login1$rblTipo=rdBtnContribuinte&ctl00$ConteudoPagina$Login1$rblTipo=rdBtnNaoContribuinte$ConteudoPagina$Login1$rblTipo=rdBtnContabilista$ConteudoPagina$Login1$rblTipo=rdBtnFazendario$ConteudoPagina$Login1$rblTipo=rdBtnProcon$ConteudoPagina$Login1$rblTipo=rdBtnAdvogadoRepresentante$ctl00$ConteudoPagina$Login1$UserName=valor2$ctl00$ConteudoPagina$Login1$Password=valor2

  • But since these are radio buttons, it seems I need to send which one of them was checked, and that is where I'm stuck.

  • Could you edit the question to include these details? That is easier to follow than reading them in the comments.

  • I edited it. The user and password are values I will receive from the user.

  • It seems to me that several & are missing from your parameters.

  • I understood what you said and ran it again, but the parameters are still not accepted.

  • An & was missing after the user.

  • Since the name ctl00$ConteudoPagina$Login1$rblTipo suggests that this is a radio button, I think you should send just one of them, not all.

  • Right, the radio button is exactly the part I don't know how to pass. From what I researched, I have to send which one was checked, but I tried a few ways and it didn't work. Maybe it works if I pass only one of them. I'll test it, thanks.

  • I added some inputs that were hidden to the parameters, but it didn't work: it sent me to a screen saying that the request failed.

  • If you add con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded"); does anything change?

  • I added it, but nothing changed. It sent me to this page: https://www.nfp.fazenda.sp.gov.br/Erro.aspx

1 answer

Try it like this:

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.ProtocolException;
import java.net.URL;
import java.util.stream.IntStream;

import javax.net.ssl.HttpsURLConnection;

public class Crawler {

    private static final String USER_AGENT = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36";

    private final String user;
    private final String password;
    private final TipoLogin tipo;

    public static enum TipoLogin {
        CONTRIBUINTE_ICMS("rdBtnContribuinte"),
        CONSUMIDOR("rdBtnNaoContribuinte"),
        CONTABILISTA("rdBtnContabilista"),
        FAZENDARIO("rdBtnFazendario"),
        PROCON("rdBtnProcon"),
        REPRESENTANTE_CONTRIBUINTE("rdBtnAdvogadoRepresentante");

        private final String radio;

        private TipoLogin(String radio) {
            this.radio = radio;
        }

        public String getRadio() {
            return radio;
        }
    }

    public static void main(String[] args) throws IOException {
        Crawler http = new Crawler("12345678901", "$enh4", TipoLogin.CONTRIBUINTE_ICMS);
        http.sendPost();
    }

    public Crawler(String user, String password, TipoLogin tipo) {
        this.user = user;
        this.password = password;
        this.tipo = tipo;
    }

    // HTTP POST request
    private void sendPost() throws IOException {
        URL url;
        try {
            url = new URL("https://www.nfp.fazenda.sp.gov.br/login.aspx");
        } catch (MalformedURLException e) {
            throw new AssertionError(e);
        }

        HttpsURLConnection get = (HttpsURLConnection) url.openConnection();
        get.setRequestProperty("User-Agent", USER_AGENT);
        get.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        get.getResponseCode();
        String page = download(get.getInputStream());

        HttpsURLConnection con = (HttpsURLConnection) url.openConnection();

        try {
            con.setRequestMethod("POST");
        } catch (ProtocolException e) {
            throw new AssertionError(e);
        }
        con.setRequestProperty("User-Agent", USER_AGENT);
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");

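        // The hidden values are Base64 and may contain '+', '/' and '=', so they must be escaped too.
        String urlParameters = "__EVENTTARGET=" + escapeURI(buscarCampo(page, "__EVENTTARGET"))
                + "&__EVENTARGUMENT=" + escapeURI(buscarCampo(page, "__EVENTARGUMENT"))
                + "&__VIEWSTATE=" + escapeURI(buscarCampo(page, "__VIEWSTATE"))
                + "&__EVENTVALIDATION=" + escapeURI(buscarCampo(page, "__EVENTVALIDATION"))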
                + "&ctl00$ddlTipoUsuario=#rdBtnNaoContribuinte"
                + "&ctl00$UserNameAcessivel="
                + "&ctl00$PasswordAcessivel="
                + "&ctl00$ConteudoPagina$Login1$rblTipo=" + tipo.getRadio()
                + "&ctl00$ConteudoPagina$Login1$UserName=" + escapeURI(user)
                + "&ctl00$ConteudoPagina$Login1$Password=" + escapeURI(password);

        System.out.println("Parameters : " + urlParameters);

        // Send post request
        con.setDoOutput(true);
        try (DataOutputStream wr = new DataOutputStream(con.getOutputStream())) {
            wr.writeBytes(urlParameters);
            wr.flush();
        }

        int responseCode = con.getResponseCode();
        System.out.println("Sending 'POST' request to URL : " + url);
        System.out.println("Response Code: " + responseCode);

        String response = download(con.getInputStream());

        //print result
        System.out.println(response);
    }

    private static String download(InputStream is) throws IOException {
        StringBuilder response = new StringBuilder(1024);
        try (BufferedReader in = new BufferedReader(new InputStreamReader(is))) {
            String inputLine;

            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
        }
        return response.toString();
    }

    private static String buscarCampo(String html, String campo) {
        String input = "<input type=\"hidden\" name=\"" + campo + "\" id=\"" + campo + "\" value=\"";
        int a = html.indexOf(input);
        if (a == -1) return "";
        int b = html.indexOf('\"', a + input.length());
        return html.substring(a + input.length(), b);
    }

    private static final String[] HEX = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"};

    private static String escapeURI(byte c) {
        boolean ok = (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~'
                || c == '$' || c == '#';
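        // Mask the high nibble: bytes are signed, so c >>> 4 alone is out of range for non-ASCII bytes.
        return ok ? String.valueOf((char) c) : "%" + HEX[(c >> 4) & 0xF] + HEX[c & 0xF];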
    }

    public static String escapeURI(String in) {
        StringBuilder sb = new StringBuilder(in.length() * 2);
        try {
            byte[] bytes = in.getBytes("UTF-8");
            IntStream.range(0, bytes.length).mapToObj(i -> escapeURI(bytes[i])).forEach(sb::append);
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e);
        }
        return sb.toString();
    }
}

The fields __EVENTTARGET, __EVENTARGUMENT, __VIEWSTATE and __EVENTVALIDATION are problematic: the server may generate their values itself and expect to read the same values back. Because of this, I first do a GET on the page to read the values of these fields and then do a POST with the values of all fields.
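
Reading the hidden fields out of the downloaded page is the fragile part of this step, because buscarCampo only matches when the attributes appear in exactly the order type, name, id, value. As a minimal sketch, assuming the fields are rendered as <input> tags with double-quoted attributes (the class name, the method name hiddenField and the sample HTML are made up for illustration), a regular expression can tolerate a different attribute order:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HiddenFieldSketch {

    private static final Pattern VALUE =
            Pattern.compile("\\bvalue=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the value attribute of the first <input> tag with the given name,
    // or "" when the tag or its value attribute is not present.
    static String hiddenField(String html, String name) {
        Pattern tag = Pattern.compile(
                "<input\\b[^>]*\\bname=\"" + Pattern.quote(name) + "\"[^>]*>",
                Pattern.CASE_INSENSITIVE);
        Matcher m = tag.matcher(html);
        if (!m.find()) {
            return "";
        }
        Matcher v = VALUE.matcher(m.group());
        return v.find() ? v.group(1) : "";
    }

    public static void main(String[] args) {
        // Hypothetical page fragment; the real login page is not reproduced here.
        String page = "<form>"
                + "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEPDw+abc=\" />"
                + "<input value=\"xyz\" name=\"__EVENTVALIDATION\" type=\"hidden\" />"
                + "</form>";
        System.out.println(hiddenField(page, "__VIEWSTATE"));
        System.out.println(hiddenField(page, "__EVENTVALIDATION"));
        System.out.println(hiddenField(page, "__EVENTTARGET").isEmpty());
    }
}
```

This is still pattern matching, not HTML parsing, so it would break on single-quoted attributes or on values containing HTML entities.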

Note the fields ctl00$ddlTipoUsuario, ctl00$UserNameAcessivel and ctl00$PasswordAcessivel. These fields are at the top of the form and are sent along with the request, even though they are not needed.

Finally, the form fields that matter to you (whose values are passed to the constructor in main(String[])) are these:

  • The ctl00$ConteudoPagina$Login1$rblTipo, which corresponds to the radio buttons and may be rdBtnContribuinte, rdBtnNaoContribuinte, rdBtnContabilista, rdBtnFazendario, rdBtnProcon or rdBtnAdvogadoRepresentante.

  • The ctl00$ConteudoPagina$Login1$UserName which is the user name.

  • The ctl00$ConteudoPagina$Login1$Password which is the password.

Note that I percent-encode the user and password to "escape" special characters.
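
The same escaping can be done with the JDK's own java.net.URLEncoder instead of a hand-written escaper. The following is a minimal sketch, assuming UTF-8 and standard form encoding; the class and method names are made up for illustration, and the field values are the placeholders from main above. Unlike escapeURI, URLEncoder also encodes $ and #, and it writes a space as +.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBodySketch {

    // Form-encodes one name or value as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e); // UTF-8 is always available
        }
    }

    // Joins the fields as name=value pairs separated by '&', in insertion order.
    static String formBody(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Only one radio value is sent: the one that is checked.
        fields.put("ctl00$ConteudoPagina$Login1$rblTipo", "rdBtnContribuinte");
        fields.put("ctl00$ConteudoPagina$Login1$UserName", "12345678901");
        fields.put("ctl00$ConteudoPagina$Login1$Password", "$enh4");
        System.out.println(formBody(fields));
    }
}
```

Building the body from a map also makes a missing & between two parameters impossible, which was one of the problems raised in the comments on the question.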

There are probably things I missed. Let me know in the comments whether it works or not.
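
One candidate for a missing piece, offered as an assumption and not verified against this site: ASP.NET pages commonly set a session cookie on the GET, and the code above opens the POST on a fresh connection without sending any cookie back. A minimal sketch with the JDK's java.net.CookieManager follows. The class name, the helper names and the cookie value abc123 are made up, and the example simulates the server's Set-Cookie header instead of touching the network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class CookieSketch {

    // Once a default CookieHandler is installed, HttpURLConnection stores
    // Set-Cookie headers from responses and replays them on later requests.
    static CookieManager install() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        CookieHandler.setDefault(manager);
        return manager;
    }

    // Feeds one Set-Cookie header into the manager and returns the Cookie
    // headers it would attach to the next request for the same URI.
    static List<String> replay(CookieManager manager, URI uri, String setCookie) {
        try {
            Map<String, List<String>> response =
                    Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookie));
            manager.put(uri, response);
            Map<String, List<String>> noHeaders = Collections.emptyMap();
            return manager.get(uri, noHeaders).get("Cookie");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CookieManager manager = install(); // call once, before the GET
        URI login = URI.create("https://www.nfp.fazenda.sp.gov.br/login.aspx");
        // Hypothetical session cookie, as the server might send it on the GET.
        System.out.println(replay(manager, login, "ASP.NET_SessionId=abc123; path=/"));
    }
}
```

In the crawler itself only the call to install() would be needed, at the start of main; the replay helper exists just to show the effect without a real request.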

  • @Roque https://docs.oracle.com/javase/8/docs/api/java/util/stream/IntStream.html#range-int-int-

  • I passed the user and password parameters, but I was redirected to the error page all the same.

  • Try using some browser plugin like webdeveloper to see exactly what the browser sends in your case (both in the request body and in the headers). I can't test it myself because I don't have a login and password.

  • No worries, I'll try to figure out how the browser sends this request. The Java part is ready; I just need to discover the right parameters. This already helped a lot, thanks.

  • I managed to pass the parameters, but now it is asking for a captcha. Is there a way to send the request so that the captcha is disabled?

  • I don't think so, because if there were a way to disable the captcha, what good would the captcha do anyway? The best thing to do is try to use some captcha breaker.

  • I got the parameters through without triggering the captcha, but I still have the following problem: when I enter the URL and the parameters in the browser, it logs in correctly, but when I run my code it stays on the login page - http://answall.com/questions/73293/enviar-requisi%C3%A7%C3%A3o-via-get
