Getting table elements with HtmlAgilityPack

I have this structure repeated over and over again.

1st Table

<table>
<tbody>
<tr>
<th>titulo</th>
</tr>
</tbody>
</table>

2nd Table

<table>
<tbody>
<tr>
<th>Texto</th>
<th>Texto</th>
<th>Texto</th>
<th>Texto</th>
</tr>
</tbody>
</table>

There are several more tables following this pattern.

How do I put them into an array and a list so I can get the values of each one?

  • I didn’t quite understand the question. Do you want to put the content inside the TH elements into a list? Is that it? Do you have one or more tables in your HTML?

1 answer

You can use Fizzler alongside HtmlAgilityPack. It works on top of HtmlAgilityPack and lets you run CSS-selector queries against the parsed document to pick out the elements you want. You can install Fizzler from NuGet.

It would work like this:

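using System.Linq;
using System.Net;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // provides QuerySelectorAll

WebClient wc = new WebClient();
// Send a browser-like User-Agent, since some sites reject requests without one.
wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36");
string pag = wc.DownloadString("URL of the page you want to get the information from");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml2(pag);
var tabela = doc.DocumentNode.QuerySelectorAll("your CSS selector").ToList();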

You can read more about the selector syntax, but the basics are:

"#" = ID of the element,

"." = class of the element,

and plain HTML tag names (such as table or th) select elements by tag.
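
For the tables in the question, a tag selector is probably enough. The following is only a minimal sketch, assuming the HtmlAgilityPack and Fizzler.Systems.HtmlAgilityPack NuGet packages are installed; the class name TableHeaders and the method Extract are made up for illustration, and the HTML is an inline sample instead of a downloaded page. It builds a list with one string array per table, each array holding the text of that table's th cells.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

// Hypothetical helper, not part of either library.
public static class TableHeaders
{
    // Returns one string[] per <table>, holding the text of its <th> cells.
    public static List<string[]> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .QuerySelectorAll("table")                  // every table in the page
            .Select(table => table
                .QuerySelectorAll("th")                 // header cells inside this table
                .Select(th => th.InnerText.Trim())
                .ToArray())
            .ToList();
    }

    public static void Main()
    {
        // Inline sample mirroring the two tables from the question.
        string html =
            "<table><tbody><tr><th>titulo</th></tr></tbody></table>" +
            "<table><tbody><tr><th>Texto</th><th>Texto</th><th>Texto</th><th>Texto</th></tr></tbody></table>";

        foreach (string[] headers in Extract(html))
            Console.WriteLine(string.Join(" | ", headers));
    }
}
```

If the real page mixes these tables with others, a narrower selector (for example one based on an ID or class, as described above) would replace the plain "table" selector.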
