Invalid character in HTTP Request C#


Viewed 669 times

0

I need to develop a Web API that takes a query and returns the Google search results as titles and URLs. I am using C# and .NET. When I make the HTTP request to fetch the page data, the special characters come back as "?". Does anyone have an idea how to fix this?

Another question: why does a GET request made through Postman return a lot of data, while the same request made from C# (WebRequest) returns much less? (Screenshot: string returned by the C# WebRequest.)

Here is the code:

        string sURL = "https://www.google.com/search?q=" + this.Query + "&sourceid=chrome&ie=UTF-8";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(sURL);

        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        string myResponse = "";
        using (System.IO.StreamReader sr = new System.IO.StreamReader(response.GetResponseStream()))
        {
            myResponse = sr.ReadToEnd();
        }
        return myResponse;

I ran some tests against other websites and they return the characters correctly, so the problem seems to be specific to Google (although through Postman the characters come back correctly).

  • Where is your code?

  • I just edited the question

  • Try passing the UTF-8 encoding as the second parameter of the StreamReader; I think that solves the character problem.

  • @Francis unfortunately that did not resolve it.

  • The code you posted works well, congratulations. But the code that sends the result to the page is still missing, and that is where the problem is: in the way the HTML is being passed. In the image it is being passed as text.

  • 1

    @Augustovasques yes, I'm passing it as text and then filtering that text to keep only the titles and URLs (the part that interests me). What is happening, though, is that these invalid characters apparently come directly from the request, because if I make requests to sites like globo.com and uol.com.br, the characters arrive perfectly.

  • 1

    UTF-8 and HtmlEncode / HtmlDecode

  • Unfortunately it didn’t work either.

  • Wouldn't a String.Replace(char, char) do, then? return myResponse = myResponse.Replace('\ufffd', ' ');

  • The problem is that all the characters that end up as \ufffd are characters with some kind of accent, such as ~ ` ^

  • @Felipechiarotti, did you see my answer?


2 answers

0

Try using the ISO-8859-1 encoding. I tested it with the code from the question and the special characters were returned correctly.

using (System.IO.StreamReader sr = new System.IO.StreamReader(response.GetResponseStream(), Encoding.GetEncoding("ISO-8859-1")))

The image shows the response of the request with its special characters. (Screenshot: response saved to a file.)
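Rather than hard-coding one encoding, a possible variation is to pick the encoding from the charset that the server declares. The following is only a sketch: it assumes the charset name is taken from HttpWebResponse.CharacterSet, and the ResponseEncoding class and its Resolve method are hypothetical names that do not appear in the answer above.

```csharp
using System;
using System.Text;

public static class ResponseEncoding
{
    // Returns the encoding named by the server's charset, falling back to
    // UTF-8 when the name is missing or not supported by this runtime.
    public static Encoding Resolve(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return Encoding.UTF8;
        }

        try
        {
            return Encoding.GetEncoding(charset.Trim());
        }
        catch (Exception ex) when (ex is ArgumentException || ex is NotSupportedException)
        {
            // Unknown or unavailable charset: assume UTF-8 instead of failing.
            return Encoding.UTF8;
        }
    }
}
```

With this helper, the reader from the question could be created as new StreamReader(response.GetResponseStream(), ResponseEncoding.Resolve(response.CharacterSet)). Whether Google declares a charset that matches the bytes it actually sends is something to verify against a real response.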

0

You can resolve the accents by using CultureInfo in your StreamReader:

using System.Globalization;
using System.Net;
using System.Text;

string Query = "teste";

string sURL = "https://www.google.com/search?q=" + Query + "&sourceid=chrome&ie=UTF-8";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(sURL);

HttpWebResponse response = (HttpWebResponse)request.GetResponse();
string myResponse = "";

// Decode the body with the ANSI code page of the pt-BR culture (1252).
var culture = CultureInfo.GetCultureInfo("pt-BR");
using (System.IO.StreamReader sr = new System.IO.StreamReader(response.GetResponseStream(),
        Encoding.GetEncoding(culture.TextInfo.ANSICodePage), false))
{
    myResponse = sr.ReadToEnd();
}

The other characters, such as \u003E (which represents >), appear as Unicode escape sequences because they sit inside strings. They are written this way so that special characters do not break the content, and you have to handle them yourself when you read those parts of the response.
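One way to handle those escape sequences is to replace each \uXXXX with the character it stands for. This is a minimal sketch and not part of the original answer: the UnicodeEscapes class and its Decode method are hypothetical names, and it only covers the four-digit \uXXXX form.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscapes
{
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static readonly Regex EscapePattern = new Regex(@"\\u([0-9A-Fa-f]{4})");

    // Replaces every \uXXXX sequence in the text with the UTF-16 code unit it encodes.
    public static string Decode(string text)
    {
        return EscapePattern.Replace(text, match =>
        {
            int codeUnit = int.Parse(match.Groups[1].Value, NumberStyles.HexNumber);
            return ((char)codeUnit).ToString();
        });
    }
}
```

It could be applied to the string returned by ReadToEnd, for example myResponse = UnicodeEscapes.Decode(myResponse), before filtering out the titles and URLs.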

  • Good morning Leandro, doing it this way I get the following exception: System.NotSupportedException: 'No data is available for encoding 1252. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.'

  • Is the solution .NET Core?
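The exception quoted in the comments points at Encoding.RegisterProvider. A sketch of what that could look like, assuming the project targets .NET Core or .NET 5+, where legacy code pages such as 1252 are not available until CodePagesEncodingProvider is registered (on older .NET Core versions this may also require the System.Text.Encoding.CodePages NuGet package). The LegacyEncodings class name is hypothetical.

```csharp
using System.Text;

public static class LegacyEncodings
{
    // Registers the code-page provider and returns the Windows-1252 encoding.
    // Registering the same provider more than once is harmless.
    public static Encoding GetWindows1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }
}
```

The result can then be passed to the StreamReader in place of Encoding.GetEncoding(culture.TextInfo.ANSICodePage). In a real application the registration would normally run once at startup.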
