Special characters garbled in HTTP GET response


I'm having trouble making HTTP GET requests.

When the page text contains special characters, the response comes back garbled.

Example:

os participantes dever&atilde;o:

when the original text was "os participantes deverão:".

The code I'm using to make the request is as follows:

    try {
        URL url = new URL("***Url***");
        BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
        String strTemp = "";
        while (null != (strTemp = br.readLine())) {
            System.out.println(strTemp);
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }

Any idea what might be causing this problem?

1 answer


Constructs such as &amp; and &atilde; are called HTML character entities. They are not errors; the original page most likely already contained them in its source. They are used to represent characters that are reserved in HTML, as well as characters such as accented letters.

See this answer on Stack Overflow in English for some ways to replace them with the corresponding characters. The most popular seems to be the StringEscapeUtils.unescapeHtml4() method from the Apache Commons library.
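To make the idea concrete, here is a minimal sketch of what such a replacement does, written with the standard library only. It is not the Apache Commons implementation: the class name EntityDecoder and its decode method are hypothetical, and the table of named entities is a tiny illustrative subset (the real HTML list has many more entries). It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any library.
public class EntityDecoder {
    // Small illustrative subset of named entities (assumption: the page
    // only uses these). Unicode escapes avoid source-encoding problems.
    private static final Map<String, String> NAMED = Map.of(
        "amp", "&", "lt", "<", "gt", ">", "quot", "\"",
        "atilde", "\u00E3", "ccedil", "\u00E7", "eacute", "\u00E9");

    // Matches &name;, decimal &#227; and hexadecimal &#xE3; references.
    private static final Pattern ENTITY = Pattern.compile(
        "&(#[xX][0-9a-fA-F]{1,6}|#[0-9]{1,7}|[A-Za-z][A-Za-z0-9]*);");

    // Converts a code point to a string, or keeps the original text
    // if the number is not a valid Unicode code point.
    private static String fromCodePoint(int cp, String fallback) {
        return Character.isValidCodePoint(cp)
            ? new String(Character.toChars(cp))
            : fallback;
    }

    // Replaces each entity in a single pass; unknown names are left as-is.
    public static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(
                    Integer.parseInt(body.substring(2), 16), m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(
                    Integer.parseInt(body.substring(1)), m.group());
            } else {
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Decode the example line from the question.
        System.out.println(decode("os participantes dever&atilde;o:"));
    }
}
```

In the loop from the question, this would mean printing decode(strTemp) instead of strTemp. With Apache Commons on the classpath, the equivalent call would be StringEscapeUtils.unescapeHtml4(strTemp), which covers the full entity list and is the safer choice for real pages.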
