Import HTML table based on a URL and fill a datatable

    using System.Data;
    using System.Linq;
    using System.Net;
    using HtmlAgilityPack;

    string htmlCode = "";

    // Download the page's HTML, sending a User-Agent header with the request.
    using (WebClient client = new WebClient())
    {
        client.Headers.Add(HttpRequestHeader.UserAgent, "AvoidError");
        htmlCode = client.DownloadString("http://www.site.html");
    }

    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(htmlCode);

    // Each <th> cell becomes a column of the DataTable.
    var headers = doc.DocumentNode.SelectNodes("//tr/th");
    DataTable table = new DataTable();
    foreach (HtmlNode header in headers)
        table.Columns.Add(header.InnerText);

    // Each <tr> that contains <td> cells becomes a row.
    foreach (var row in doc.DocumentNode.SelectNodes("//tr[td]"))
        table.Rows.Add(row.SelectNodes("td").Select(td => td.InnerText).ToArray());

This example was posted in another question on SOen. I don't know whether the DataTable has already been filled at this point, and I don't know what the table's field (column) names are.

  • Could you explain your problem better?

  • Yes. I want to import an HTML table from a site, based on its URL, and fill a DataTable with it. Once it is filled, I want to use the DTO (data transfer object) from my application to populate another control that shows the data on screen. I don't know whether the DataTable is already filled at this point in the code, and I don't know how to get the names of the table's fields. This is the first time I have used this approach; I have always used a data source such as SQL to get the data.

  • Have you tested this code? It looks like it already generates the DataTable.

  • Yes, I have tested it now and the DataTable contains the rows and columns, but I cannot get the names of the table's fields to implement the rest.

  • Could you post the site you are using to retrieve this data? That is, if you don't mind sharing it.

1 answer

I didn’t quite understand your problem, but I will modify the code a little to try to explain better.

First, the link you posted in the comments contains more than one table, so let’s get the table you want by id.

Your final code will look like this:

            string htmlCode = "";

            using (WebClient client = new WebClient())
            {
                client.Headers.Add(HttpRequestHeader.UserAgent, "AvoidError");
                htmlCode = client.DownloadString("http://www.codiceinverso.it/directory-cognomi/cadore.html");
            }

            HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
            doc.LoadHtml(htmlCode);
            DataTable table = new DataTable();

            // Select all header cells of the table whose id is 'cognomi'
            var cabecalhos = doc.DocumentNode.SelectNodes("//table[@id='cognomi']/thead/tr/th");
            foreach (HtmlNode col in cabecalhos)
            {
                // Add one column per header cell
                table.Columns.Add(col.InnerText);
            }

            // Select all body rows that contain data cells
            var linhas = doc.DocumentNode.SelectNodes("//table[@id='cognomi']/tbody/tr[td]");
            foreach (var row in linhas)
            {
                // Add one row per <tr>, using the text of each <td> as the cell values
                table.Rows.Add(row.SelectNodes("td").Select(td => td.InnerText).ToArray());
            }

The generated DataTable will have 2 columns and 10 rows, as can be seen in the images below:

Columns:

[image]

Rows:

[image]
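
Since the question was also about reading the field names once the table is filled, here is a minimal sketch of how they could be read back from a DataTable. The class `DataTableInspector`, its methods, and the sample columns "Name" and "Count" are hypothetical names made up for this illustration; the real column names are whatever text the page's th cells contain.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableInspector
{
    // Returns the column names in the order the columns were added to the table.
    public static List<string> GetColumnNames(DataTable table)
    {
        return table.Columns.Cast<DataColumn>()
                    .Select(column => column.ColumnName)
                    .ToList();
    }

    // Describes every row as "column=value" pairs, reading each cell by column name.
    public static List<string> DescribeRows(DataTable table)
    {
        var names = GetColumnNames(table);
        return table.Rows.Cast<DataRow>()
                    .Select(row => string.Join(", ", names.Select(n => n + "=" + row[n])))
                    .ToList();
    }

    // Hypothetical stand-in for the scraped table (placeholder headers and values).
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("Name");   // placeholder header text
        table.Columns.Add("Count");  // placeholder header text
        table.Rows.Add("Alpha", "1");
        table.Rows.Add("Beta", "2");
        return table;
    }
}
```

Passing the `table` built from the page to `GetColumnNames` should give the header texts exactly as `InnerText` returned them, so they may need a `Trim()` before being matched against DTO property names. Reading cells with `row[columnName]` is one way to copy each row into a DTO.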

  • Yeah, that's it. Thanks, buddy.
