How does HtmlUnit work?

Does anyone have an example, using HtmlUnit in Java, of one web system logging in through another? I would send the username and password to the other site, which would authenticate them and tell me whether they are correct.

An example of what I want to do: I have a web system whose login should go through another system. The client, who owns both, preferred not to give access to that other system's database. So I have a login page where the user enters a username and password; through HtmlUnit I send this information to the other system as a request, and I get a JavaScript page back as the response.

1 answer

In short, HtmlUnit has an API that lets Java applications perform the same actions a user would perform in a browser. Examples include requesting a web page, clicking buttons and links, and filling in forms.

Roughly speaking, it is a browser without a graphical interface, which is how the project's own maintainers describe it. Features and other information can be found on the project page.

Example

Consider the page at http://meusiteficticio.com, which contains a form with this structure:

<form id='form-login' action='/login' method='post'>
   <input name='user' type='text' placeholder='Nome de usuário'/>
   <input name='pass' type='password' placeholder='Senha'/>
   <input type='submit' value='entrar'/>
</form>

In a browser, the user would enter a username and password in the appropriate fields and then click the button to submit the form. We will do the same, but from within the application.

HtmlUnit implemented (v2.8) and later made public (v2.11) the methods querySelector and querySelectorAll, which work like the JavaScript functions of the same name. With these methods the code can look like this:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

// Get the login page.
HtmlPage paginaDeLogin = new WebClient(BrowserVersion.BEST_SUPPORTED)
                             .getPage("http://meusiteficticio.com");

// Get the form elements.
HtmlTextInput inputNomeDeUsuario = paginaDeLogin.querySelector("input[name='user']");
HtmlPasswordInput inputSenha = paginaDeLogin.querySelector("input[name='pass']");
HtmlSubmitInput botaoEnviar = paginaDeLogin.querySelector("#form-login > input[type='submit']");

// Set the 'value' attribute of the inputs.
inputNomeDeUsuario.setValueAttribute("joao");
inputSenha.setValueAttribute("joao1234");

// Simulate a click on the submit button and wait for the resulting page.
HtmlPage paginaAposOLogin = botaoEnviar.click();

// Print the HTML of the returned page.
System.out.println(paginaAposOLogin.asXml());

If you are using an older version that does not support querySelector, you first have to get the form and then fetch each input from it with the method getInputByName:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

// Simulate a Chrome browser.
WebClient client = new WebClient(BrowserVersion.CHROME);

// Get the page.
HtmlPage paginaDeLogin = client.getPage("http://meusiteficticio.com");

// Get the login form by its "id" attribute in the HTML.
// The second parameter controls case sensitivity: with true the id must
// match exactly; with false, "FoRm-LoGiN" would also find the form.
HtmlForm formularioDeLogin = paginaDeLogin.getElementById("form-login", true);

// Get the inputs (of the form) by their "name" attribute:
HtmlTextInput inputNomeDeUsuario = formularioDeLogin.getInputByName("user");
HtmlPasswordInput inputSenha = formularioDeLogin.getInputByName("pass");

// The submit "button" has no name, id, class, etc.,
// so one way to get it is through value='entrar'.
HtmlSubmitInput botaoEnviar = formularioDeLogin.getInputByValue("entrar");

// Fill in the username and password fields
// (as if typing them in the browser).
inputNomeDeUsuario.setValueAttribute("joao");
inputSenha.setValueAttribute("joao1234");

// Simulate a click on the submit button and wait for the resulting page.
HtmlPage paginaAposOLogin = botaoEnviar.click();

// Print the HTML of the returned page.
System.out.println(paginaAposOLogin.getWebResponse().getContentAsString());
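
Since the question asks how to tell whether the credentials were accepted, here is a minimal sketch of one way to classify the page that click() returns. It uses only the JDK so that it runs on its own. The class name LoginOutcome, the "/login" path check, and the error marker text are all assumptions made for illustration; the real target site may signal failure differently. With HtmlUnit, the three inputs would typically come from paginaAposOLogin.getWebResponse().getStatusCode(), paginaAposOLogin.getUrl().getPath(), and paginaAposOLogin.getWebResponse().getContentAsString().

```java
// Hypothetical helper: decides whether a login attempt appears to have succeeded.
// The rules below are assumptions about the target site, not HtmlUnit behavior.
final class LoginOutcome {

    private LoginOutcome() {}

    /**
     * @param statusCode HTTP status of the response that followed the form submit
     * @param finalPath  path of the page reached after the submit
     * @param body       response body as text
     */
    static boolean looksSuccessful(int statusCode, String finalPath, String body) {
        // Any non-2xx status is treated as a failed attempt.
        if (statusCode < 200 || statusCode >= 300) {
            return false;
        }
        // Assumption: a failed login re-renders the form at "/login".
        if ("/login".equals(finalPath)) {
            return false;
        }
        // Assumption: the site prints this marker when credentials are rejected.
        return body == null || !body.contains("invalid username or password");
    }

    public static void main(String[] args) {
        System.out.println(looksSuccessful(200, "/home", "<h1>Welcome</h1>"));   // true
        System.out.println(looksSuccessful(200, "/login", "<form>...</form>"));  // false
        System.out.println(looksSuccessful(401, "/home", ""));                   // false
    }
}
```

Which of these signals is reliable depends entirely on the other system, so it is worth inspecting a successful and a failed response by hand before settling on a rule.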

Be nice and handle the exceptions. Trying to set (or otherwise manipulate) a value on an input that does not exist throws a NullPointerException, because querySelector returns null when nothing matches.
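
One way to avoid a bare NullPointerException is to check each lookup result before using it. The sketch below is a small JDK-only helper that turns a null lookup result into an exception naming the selector that failed. The name Elements.require is hypothetical and is not part of HtmlUnit.

```java
import java.util.NoSuchElementException;

// Hypothetical helper, not part of HtmlUnit: fails fast with a readable message
// when a lookup returned null.
final class Elements {

    private Elements() {}

    static <T> T require(T element, String selector) {
        if (element == null) {
            throw new NoSuchElementException("No element matches selector: " + selector);
        }
        return element;
    }

    public static void main(String[] args) {
        // A non-null value is passed through unchanged.
        String found = require("input-user", "input[name='user']");
        System.out.println(found);

        // A null value (what a failed lookup yields) produces a descriptive error.
        try {
            require(null, "input[name='missing']");
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the login code, the value assigned to inputNomeDeUsuario, inputSenha, or botaoEnviar could be passed through Elements.require right after the lookup and before calling setValueAttribute or click on it.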

Keeping the cookies

If you need to keep cookies for use in subsequent requests, you must configure a CookieManager on your "browser", the WebClient.

WebClient client = new WebClient(BrowserVersion.FIREFOX_24);
CookieManager cookieManager = client.getCookieManager();
cookieManager.setCookiesEnabled(true);
client.setCookieManager(cookieManager);

HtmlPage fb = client.getPage("https://facebook.com");

Disabling Alerts and Warnings

HtmlUnit logs every problem it finds in the HTML document, for example obsolete attributes and errors in JavaScript code and CSS.

You can turn off these messages by setting the HtmlUnit logger level to OFF:

Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
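
The one-liner above relies on java.util.logging. A slightly fuller sketch, with imports, follows. It assumes HtmlUnit's log output ends up in java.util.logging; HtmlUnit logs through Apache Commons Logging, so a different backend such as Log4j would need its own configuration. The class name HtmlUnitLogging is hypothetical. The logger is kept in a static field because java.util.logging holds loggers weakly, and a level set on an otherwise unreferenced logger can be lost after garbage collection.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class; only the logger name comes from HtmlUnit.
final class HtmlUnitLogging {

    // Strong reference so the configured logger is not garbage-collected.
    private static final Logger HTMLUNIT_LOGGER =
            Logger.getLogger("com.gargoylesoftware.htmlunit");

    private HtmlUnitLogging() {}

    /** Turns off everything logged under the HtmlUnit package and returns the logger. */
    static Logger silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        return HTMLUNIT_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = silence();
        System.out.println(logger.getName() + " -> " + logger.getLevel());
    }
}
```

Calling HtmlUnitLogging.silence() once, before creating the WebClient, should be enough under these assumptions.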
