How do I login to a system via an HTTP request?

3

I'm making an HTTPS GET request to the address below; my initial goal is to retrieve the page's HTML. I followed Mkyong's tutorial, but I get response code 302. I don't know what I'm doing wrong: I added the necessary headers, but I still don't get a successful result. Here is the code:

public class HttpUrlConnectionUFAC {

    private List<String> cookies;
    private HttpsURLConnection conn;

    private final String USER_AGENT = "Mozilla/5.0";

    public static void main(String[] args) {
        String url = "https://portal.ufac.br/aluno/login/";

        HttpUrlConnectionUFAC http = new HttpUrlConnectionUFAC();

        // make sure cookies are turned on
        CookieHandler.setDefault(new CookieManager());

        // 1. Send a "GET" request, so that you can extract the form's data.
        String page = null;
        try {
            page = http.getPageContent(url);
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Resposta:\n"+page);
    }

    private String getPageContent(String url) throws Exception {

        URL obj = new URL(url);
        conn = (HttpsURLConnection) obj.openConnection();

        // default is GET
        conn.setRequestMethod("GET");

        // Acts like a browser
        conn.setUseCaches(false);
        conn.setRequestProperty("Remote Address",
                "200.129.173.7:443");
        conn.setRequestProperty("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        conn.setRequestProperty("Accept-Encoding",
                "gzip, deflate, sdch");
        conn.setRequestProperty("Accept-Language", "pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4");
        conn.setRequestProperty("Cache-Control", "max-age=0");
        conn.setRequestProperty("Connection", "keep-alive");
        conn.setRequestProperty("Host", "portal.ufac.br");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Referer", "https://portal.ufac.br/aluno/login.action?error=");

        if (cookies != null) {
            for (String cookie : this.cookies) {
                conn.addRequestProperty("Cookie", cookie.split(";", 1)[0]);
            }
        }
        int responseCode = conn.getResponseCode();
        System.out.println("\nSending 'GET' request to URL : " + url);
        System.out.println("Response Code : " + responseCode);

        BufferedReader in = 
                new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String inputLine;
        StringBuffer response = new StringBuffer();

        while ((inputLine = in.readLine()) != null) {
            response.append(inputLine);
        }
        in.close();

        // Get the response cookies
        setCookies(conn.getHeaderFields().get("Set-Cookie"));

        return response.toString();

    }



    public List<String> getCookies() {
        return cookies;
    }

    public void setCookies(List<String> cookies) {
        this.cookies = cookies;
    }
}

Output:

Sending 'GET' request to URL : https://portal.ufac.br/aluno/login/
Response Code : 302
Resposta:
‹

What am I doing wrong? How can I properly perform this request?

  • @re22 my initial idea is to retrieve the HTML and then authenticate on this site, but I can't even retrieve the HTML.

  • 2

    Print the response headers. There should be a Location header with the address you are being redirected to (I would also drop most of the request headers).
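
To follow up on the comment above, here is a minimal sketch (JDK only) of how the response headers could be printed and the Location header read. It turns off automatic redirect handling so the 302 stays visible. The class name RedirectInspector, its method names, and the relative Location value used in main are assumptions for illustration; the real server may answer with something else.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.List;
import java.util.Map;

public class RedirectInspector {

    /** Resolves a (possibly relative) Location value against the URL that was requested. */
    public static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    /** Sends a GET without following redirects, prints the status and all response headers. */
    public static String inspect(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(false); // keep the 3xx response visible
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");

        System.out.println("Response Code : " + conn.getResponseCode());
        for (Map.Entry<String, List<String>> header : conn.getHeaderFields().entrySet()) {
            // The status line is stored under a null key
            System.out.println(header.getKey() + " : " + header.getValue());
        }

        String location = conn.getHeaderField("Location");
        conn.disconnect();
        // null when the response carried no Location header
        return location == null ? null : resolveLocation(url, location);
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.println("Redirect target: " + inspect(args[0]));
        } else {
            // Offline demonstration with a hypothetical relative Location value
            System.out.println(resolveLocation(
                    "https://portal.ufac.br/aluno/login/", "/aluno/login.action"));
        }
    }
}
```

Passing the login URL as the first argument performs the real request; the returned address is the one to request next.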

2 answers

3


Problem solved! I used the HtmlUnit tool suggested by @re22 and managed to retrieve the information from the site.

First I created a WebClient object that simulates the Chrome browser, then I obtained its CookieManager to manage the session data, so that I can perform several requests after authenticating on the site.

final WebClient webClient = new WebClient(BrowserVersion.CHROME);
// Use the client's own cookie manager; creating a separate one first is unnecessary
CookieManager cookieMan = webClient.getCookieManager();
cookieMan.setCookiesEnabled(true);

I used these two options to suppress the warning messages related to the HTML documents received when performing requests:

webClient.getOptions().setJavaScriptEnabled(false);
webClient.getOptions().setCssEnabled(false);

In this excerpt I load the login page and get its forms (in this case there is only one), then fill in the login information in the j_username and j_password fields:

    HtmlPage pagina = webClient.getPage("https://portal.ufac.br/aluno/login.action");

    List<HtmlForm> formularios = pagina.getForms();
    HtmlForm formulario = null;

    for (HtmlForm htmlForm : formularios) {
        formulario = htmlForm;
    }
    HtmlTextInput usuario = formulario.getInputByName("j_username");
    HtmlPasswordInput senha = formulario.getInputByName("j_password");              
    usuario.setValueAttribute("******");
    senha.setValueAttribute("******");

Finally, I simulate a click on the submit button, which returns the response HTML page, and use it to obtain the session data. When authentication succeeds, the CookieManager stores two sessions; otherwise it stores only one. I then request the content of the user's profile page, now that the session is authenticated.

final HtmlPage paginaResposta = (HtmlPage) formulario.getInputByValue("Entrar").click();
paginaResposta.getWebResponse();
String result = webClient.getPage("https://portal.ufac.br/aluno/aluno/perfil/perfil.action").getWebResponse().getContentAsString();
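
For context, submitting a login form usually means sending the field values in the request body. The sketch below (JDK only) shows how such a body could be built, assuming the form is sent as a POST with the application/x-www-form-urlencoded content type, which has not been checked against this portal. The class name FormBodySketch and the credential values are placeholders.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FormBodySketch {

    /** Encodes form fields as an application/x-www-form-urlencoded body, in insertion order. */
    public static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("j_username", "joaoDaSilva"); // placeholder credentials
        fields.put("j_password", "joao1234");
        System.out.println(encode(fields));
    }
}
```

HtmlUnit takes care of this step when the button is clicked, so the sketch is only meant to show what travels over the wire in that assumed case.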

Below is the complete implementation:

    // Create the client
    final WebClient webClient = new WebClient(BrowserVersion.CHROME);
    // The CookieManager will manage the session data
    CookieManager cookieMan = webClient.getCookieManager();
    cookieMan.setCookiesEnabled(true);

    java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
    java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);

    HtmlPage pagina;
    try {
        pagina = webClient.getPage("https://portal.ufac.br/aluno/login.action");


        List<HtmlForm> formularios = pagina.getForms();
        HtmlForm formulario = null;

        for (HtmlForm htmlForm : formularios) {
            formulario = htmlForm;
        }

        HtmlTextInput usuario = formulario.getInputByName("j_username");
        HtmlPasswordInput senha = formulario.getInputByName("j_password");              
        usuario.setValueAttribute("******");
        senha.setValueAttribute("******");

        final HtmlPage paginaResposta = (HtmlPage) formulario.getInputByValue("Entrar").click();
        paginaResposta.getWebResponse();

        // Navigate to the user's profile page
        String result = webClient.getPage("https://portal.ufac.br/aluno/aluno/perfil/perfil.action").getWebResponse().getContentAsString();
        System.out.println("RESULT:\n "+ result); 
    } catch (FailingHttpStatusCodeException | IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    System.out.println(cookieMan.getCookies());
  • 1

    If you only need text (not all code), you can do: String texto = webClient.getPage("foo.com").asText();.

2

You can use Jsoup. To get a page's HTML, this is enough:

Document document =  Jsoup.connect("http://answall.com").get();
System.out.println(document.html()); // the page's HTML

In your case, it seems you want more than the HTML. Since the user must be authenticated, there is probably a cookie that keeps the session alive for subsequent requests. You can get it like this:

Connection.Response response = Jsoup.connect("https://portal.ufac.br/aluno/")
          .data("j_username", "joaoDaSilva", "j_password", "joao1234")
          .method(Connection.Method.POST)
          .execute();

String theHtml = response.parse().html(); // html
Map<String, String> theCookies = response.cookies(); // gets the cookies

And in subsequent requests:

Document randomPage = Jsoup.connect("https://portal.ufac/foo")
         .cookies(theCookies)
         .get();

System.out.println(randomPage.html()); // the page's HTML
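
The same idea, a session cookie captured from one response and sent back on later requests, can be sketched with only the JDK's java.net.CookieManager (the class the question installs with CookieHandler.setDefault). The class name SessionCookieSketch, its method names, and the JSESSIONID header value are assumptions for illustration; the real portal chooses its own cookie names and attributes. Note that split(";", 1) returns the whole string unchanged, because a limit of 1 allows no splitting, so this sketch uses a limit of 2 to keep only the name=value part.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SessionCookieSketch {

    /** Keeps only the "name=value" part of a Set-Cookie header, dropping attributes such as Path. */
    public static String nameValue(String setCookieHeader) {
        return setCookieHeader.split(";", 2)[0].trim();
    }

    /** Stores the cookies of one response and returns the Cookie header values for a later request. */
    public static List<String> replay(CookieManager manager, URI loginUri,
                                      List<String> setCookieHeaders, URI nextUri) {
        try {
            manager.put(loginUri, Map.of("Set-Cookie", setCookieHeaders));
            Map<String, List<String>> requestHeaders = manager.get(nextUri, new HashMap<>());
            return requestHeaders.getOrDefault("Cookie", List.of());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical header value; a real server chooses its own cookie name and attributes
        String header = "JSESSIONID=abc123; Path=/aluno";
        System.out.println(nameValue(header));
        System.out.println(replay(new CookieManager(),
                URI.create("https://portal.ufac.br/aluno/login.action"),
                List.of(header),
                URI.create("https://portal.ufac.br/aluno/aluno/perfil/perfil.action")));
    }
}
```

When a CookieManager is installed as the default CookieHandler, HttpURLConnection performs this store-and-replay step on its own, so the cookies do not have to be copied by hand.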

If you need something more complete, an alternative is HtmlUnit. That answer has a brief explanation and an example of how to access, fill in, and submit a login form on a web page.

  • Would the "foo" in the Jsoup.connect("https://portal.ufac/foo") string be the location of the index page after login? I tried it that way and only received the HTML of the authentication page, as if I had not been authenticated on the site...

  • I keep getting the contents of the login page... Analyzing the content I found this: <div id="no-script"> Javascript is disabled or not supported by your browser. To use the Student Portal properly, please enable Javascript by changing the browser options and try again. </div> .. Could that be the problem? If so, how do I activate the JS in this request? Thank you

  • Do you have any idea how I can do this?

  • I was able to perform the HTTP request successfully using HtmlUnit. I edited my question with the solution, which uses the tool you suggested. Thank you very much.
