Traversing the response tags of a site in C#

I'm running a Google search from C# and I need to extract the title and the link of each result the search returns.

I fetch the response as follows:

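using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using HtmlAgilityPack;

public class GoogleSearch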
{
    private string _TituloPesquisa;
    private string _LinkPesquisa;
    private List<GoogleResultado> _GoogleResultadoList;

    static AutoResetEvent aguardarDocumentCompleted = new AutoResetEvent(true);

    public GoogleSearch()
    {
        _GoogleResultadoList = new List<GoogleResultado>();
    }

    public string GetResultJson 
    { 
        get
        {
            return _GoogleResultadoList.ToJsonSerialization<GoogleResultado>();
        }
    }

    public IEnumerable<GoogleResultado> GetListResults
    {
        get
        {
            return _GoogleResultadoList.ToList();
        }
    }

    public async Task Pesquisar(string textoConsulta)
    {
        await ExecuteSearchAsync(textoConsulta);
    }

    private async Task ExecuteSearchAsync(string textoConsulta)
    {
        string html = await GetHtmlResponse(textoConsulta);

        HtmlDocument documento = new HtmlDocument();
        documento.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection allElementsWithClassG = documento.DocumentNode.SelectNodes("//div[@class=\"g\"]");
        if (allElementsWithClassG == null)
            return;

        foreach (HtmlNode x in allElementsWithClassG)
        {
            HtmlNode link = x.Descendants("a").FirstOrDefault();
            if (link == null)
                continue; // no anchor inside this result block

            GoogleResultado resultado = new GoogleResultado();
            resultado.Titulo = link.InnerText;
            resultado.Url = link.GetAttributeValue("href", "").Replace("/url?q=", "");

            _GoogleResultadoList.Add(resultado);
        }
    }

    private static async Task<string> GetHtmlResponse(string textoConsulta)
    {
        HttpClient client = new HttpClient();
        // Escape the query so spaces and special characters survive in the URL.
        string url = "https://www.google.com/search?num=100&q=" + Uri.EscapeDataString(textoConsulta);
        HttpResponseMessage response = await client.GetAsync(url);

        // Read the body asynchronously instead of blocking on ReadToEnd().
        return await response.Content.ReadAsStringAsync();
    }
}

Calling code:

    private void btnPesquisar_Click(object sender, EventArgs e)
    {
        if (string.IsNullOrEmpty(txtPesquisa.Text))
            if (MessageBox.Show("Texto para pesquisa não informado.", "Atenção", MessageBoxButtons.OK, MessageBoxIcon.Information) == System.Windows.Forms.DialogResult.OK)
                return;

        google.Pesquisar(txtPesquisa.Text);
        var stringJson = google.GetResultJson;
        var listresult = google.GetListResults;
    }

I have all the search results there, but how do I walk through the tags to get the information? Can someone help me?

Response: this is the returned string, but only the beginning of it, because a Google search response is very large. Retorno

  • Isn't that what I gave you yesterday?!

  • What does the response look like?

  • @Rovannlinhalis, it didn't work because with WebBrowser the Document is only populated in the asynchronous DocumentCompleted event, and I need the result before that.

  • Can you keep using the System.Windows.Forms namespace?

  • @Rovannlinhalis, yes, I can. With that routine everything worked; I only had to discard it because the DocumentCompleted event is asynchronous. I can't pause after Navigate() and work with Document right away: the page takes time to load, and by the time it has loaded the method has already finished. I tried Thread.Sleep and Task.Delay without success.

  • I posted an answer; see if it helps you.

  • Still on the WebBrowser: you can use a flag and a timeout to wait for DocumentCompleted. I did that once and it worked well.
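The flag-and-timeout idea mentioned in the comments can be sketched with a TaskCompletionSource acting as the flag and Task.WhenAny providing the timeout. This is only an illustration under assumptions: `FakeBrowser` is a hypothetical stand-in for WebBrowser (so the sketch runs without Windows Forms), and `BrowserWaiter` and `NavigateAndWaitAsync` are made-up names, not part of any library.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebBrowser: raises DocumentCompleted
// some time after Navigate is called.
public class FakeBrowser
{
    public event EventHandler DocumentCompleted;
    public string Document { get; private set; }

    public void Navigate(string url, int loadMilliseconds)
    {
        Task.Run(async () =>
        {
            await Task.Delay(loadMilliseconds);          // simulated page load
            Document = "<html>" + url + "</html>";
            DocumentCompleted?.Invoke(this, EventArgs.Empty);
        });
    }
}

public static class BrowserWaiter
{
    // Returns the document, or null if the event did not fire within the timeout.
    public static async Task<string> NavigateAndWaitAsync(
        FakeBrowser browser, string url, int loadMilliseconds, TimeSpan timeout)
    {
        var completed = new TaskCompletionSource<bool>();   // the "flag"
        EventHandler handler = (s, e) => completed.TrySetResult(true);
        browser.DocumentCompleted += handler;
        try
        {
            browser.Navigate(url, loadMilliseconds);
            // Whichever finishes first wins: the event or the timeout.
            Task first = await Task.WhenAny(completed.Task, Task.Delay(timeout));
            return first == completed.Task ? browser.Document : null;
        }
        finally
        {
            browser.DocumentCompleted -= handler;           // always unsubscribe
        }
    }
}
```

With the real WebBrowser the same pattern would subscribe to its DocumentCompleted event instead; note that the event can fire more than once for pages with frames, so a real handler may need an extra check.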


1 answer

You will need HtmlAgilityPack, which can be added to your project as a NuGet package.

Once that's done, we follow the same logic as with the WebBrowser, changing only a few details specific to the library.

Here is the code:

    private static async Task ExecuteSearchAsync(string textoConsulta)
    {
        HttpClient client = new HttpClient();
        // Escape the query so spaces and special characters survive in the URL.
        string url = "https://www.google.com/search?num=100&q=" + Uri.EscapeDataString(textoConsulta);
        HttpResponseMessage response = await client.GetAsync(url);

        string html = await response.Content.ReadAsStringAsync();

        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlAgilityPack.HtmlNodeCollection allElementsWithClassG = doc.DocumentNode.SelectNodes("//div[@class=\"g\"]");

        List<Resultados> resultados = new List<Resultados>();
        if (allElementsWithClassG != null)
        {
            foreach (HtmlAgilityPack.HtmlNode x in allElementsWithClassG)
            {
                HtmlAgilityPack.HtmlNode link = x.Descendants("a").FirstOrDefault();
                if (link == null)
                    continue; // no anchor inside this result block

                Resultados r = new Resultados();
                r.Titulo = link.InnerText;
                r.Url = link.GetAttributeValue("href", "").Replace("/url?q=", "");
                resultados.Add(r);
            }
        }

        int count = resultados.Count; // Your List with all the search results.
    }


    public class Resultados
    {
        public string Url { get; set; }
        public string Titulo { get; set; }
    }
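One caveat with the Replace call above: a redirect link may carry extra parameters after the target address, so stripping only the "/url?q=" prefix can leave a tail such as "&sa=..." on the Url. Below is a minimal sketch of a cleanup helper. It assumes the href has the form "/url?q=<target>&<other parameters>"; that shape is an assumption, not something shown in this thread, and `GoogleHref` and `ExtractTarget` are hypothetical names.

```csharp
using System;

public static class GoogleHref
{
    // Hypothetical helper: pulls the target address out of an href shaped like
    // "/url?q=<target>&<other parameters>". Any other href is returned unchanged.
    public static string ExtractTarget(string href)
    {
        const string prefix = "/url?q=";
        if (string.IsNullOrEmpty(href) || !href.StartsWith(prefix, StringComparison.Ordinal))
            return href;

        string rest = href.Substring(prefix.Length);

        // Cut at the first '&'; this also covers an HTML-encoded "&amp;".
        int amp = rest.IndexOf('&');
        if (amp >= 0)
            rest = rest.Substring(0, amp);

        // The target is percent-encoded inside the query string.
        return Uri.UnescapeDataString(rest);
    }
}
```

In the loop it could replace the Replace call, for example r.Url = GoogleHref.ExtractTarget(link.GetAttributeValue("href", ""));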

  • The program stops here: HtmlDocument doc = new WebBrowser().Document.OpenNew(true);

  • OK, testing...

  • Edited; I think it will work now =] and you can also drop System.Windows.Forms.

  • @Rovann, I am trying to install HtmlAgilityPack and it gives this error: 'HtmlAgilityPack' already has a dependency defined for 'System.Net.Http'. (I have already referenced the System.Net.Http DLL in the project.)

  • I installed cloudscribe.HtmlAgilityPack and it worked, my friend. Thank you very much for the help, you're a beast!!!

  • Rovann, I'm still having trouble with the async part, so I'm going to edit the question and show how I'm calling this API; see if you can help me. When I read the properties that hold the results they are still null, because the search method is still running on another thread.

  • Wouldn't you have to use await in the search call?

  • Solved: private async void btnPesquisar_Click(object sender, EventArgs e) { if (string.IsNullOrEmpty(txtPesquisa.Text)) if (MessageBox.Show("Texto para pesquisa não informado.", "Atenção", MessageBoxButtons.OK, MessageBoxIcon.Information) == System.Windows.Forms.DialogResult.OK) return; await google.Pesquisar(txtPesquisa.Text); var stringJson = google.GetResultJson; var listresult = google.GetListResults; }

  • Wonderful, congratulations.
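To see why the missing await left the result properties unpopulated, here is a minimal console sketch. `FakeSearch` is a hypothetical stand-in for GoogleSearch that replaces the HTTP request with Task.Delay, and `PesquisarAsync` is a made-up name; only the await behaviour is meant to carry over.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for GoogleSearch: the network call is simulated.
public class FakeSearch
{
    private readonly List<string> _results = new List<string>();
    public IReadOnlyList<string> Results => _results;

    public async Task PesquisarAsync(string query)
    {
        await Task.Delay(100);                 // simulated HTTP latency
        _results.Add("result for " + query);   // runs only after the delay
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Without await: the call returns at its first await,
        // so the list is read before anything was added.
        var notAwaited = new FakeSearch();
        Task pending = notAwaited.PesquisarAsync("c#");
        Console.WriteLine(notAwaited.Results.Count);   // prints 0
        await pending;                                 // let it finish cleanly

        // With await: execution resumes only after the search has completed.
        var awaited = new FakeSearch();
        await awaited.PesquisarAsync("c#");
        Console.WriteLine(awaited.Results.Count);      // prints 1
    }
}
```

This is the same change as in the solved handler: marking btnPesquisar_Click as async and awaiting Pesquisar makes the property reads run after the search has finished.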
