HTML Parser in Xamarin

I am developing a Xamarin application that logs in to a web account using HttpWebRequest and extracts the relevant information from the site to show in the app.

I have already written the login part; now I need to filter the HTML that is returned.

I need to extract the information contained in the second <table> tag from the top of the HTML. Does anyone know how to do that?

The main difficulty is that the tags have no id, so I can't select them that way.

I have posted the page's HTML here for anyone who wants to copy it and help me. This HTML is returned in the Response string.

http://notepad.cc/share/hvptGOlmZQ

I need the YYYY and XXXX values contained in the HTML tables.

  • You can use the Html Agility Pack. According to this article from Xamarin's blog, it can do this. As for the tags not having ids, you can extract information from a tag by its class or name instead; it will depend on the HTML structure.

  • I'm trying to use it, but it doesn't seem to be as easy as everyone says. I haven't found a useful tutorial so far...

  • Edit the question and add the HTML you have. Don't forget to say what information you want to extract.

  • This is the important part of the HTML that I need to extract. I need to read the values of XXXXX and YYYY.

1 answer

According to the article Data Extraction in Mobile Apps, the Html Agility Pack can be used for this.

The Html Agility Pack lets you parse HTML documents. Unlike traditional XML parsers, it can recover from poorly written content, much like your web browser. In addition, the library is largely cross-platform, so it is easy to use in your mobile projects with Xamarin [...]

Assuming that the HTML file (arquivoHTML.html in the example below) contains 20 tables and you want to select only the twelfth, use the query //table[12]/tbody/tr/td//text(). The example below uses the same pattern with index 2:

// using HtmlAgilityPack;

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load("arquivoHTML.html");
// Show each text node found in the cells of the table matched by //table[2].
// Note: SelectNodes returns null when the path matches nothing.
foreach (HtmlAgilityPack.HtmlNode nodo in
    doc.DocumentNode.SelectNodes("//table[2]/tbody/tr/td//text()")) {
    MessageBox.Show(nodo.InnerText);
}
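
Since the question says the HTML arrives in a Response string rather than in a file, the same idea might be sketched as follows. This is only a sketch under assumptions: the names TableReader and ReadCells are hypothetical, and it uses LoadHtml (which parses markup held in a string) instead of Load (which reads from a file path). It also uses (//table)[n], which picks the n-th table in document order; //table[n] instead matches a table that is the n-th table child of its own parent, so it can come back empty when the tables are not siblings.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of the Html Agility Pack.
public static class TableReader
{
    // Returns the trimmed text of every cell in the table at the given
    // 1-based position (document order), or an empty list if it is absent.
    public static List<string> ReadCells(string html, int tableIndex)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses a string; Load(path) would read a file

        HtmlNode table =
            doc.DocumentNode.SelectSingleNode("(//table)[" + tableIndex + "]");

        var cells = new List<string>();
        if (table == null) return cells; // no such table

        // .//td matches the cells whether or not the markup has a tbody.
        HtmlNodeCollection tds = table.SelectNodes(".//td");
        if (tds == null) return cells; // SelectNodes returns null on no match

        foreach (HtmlNode td in tds)
            cells.Add(HtmlEntity.DeEntitize(td.InnerText).Trim());
        return cells;
    }
}
```

For instance, TableReader.ReadCells(responseHtml, 12) would return the cell texts of the twelfth table, where responseHtml stands for whatever variable holds the Response string. Note that .//td also picks up cells of any tables nested inside the selected one.
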
  • Assuming there are 20 tables in the whole HTML, how do I get the exact table I want?

  • There is nothing unique about it. I did this with BeautifulSoup in Python by loading all the tables into an array and reading all the nodes inside the table at position 12.

  • It is giving an error in the foreach. It says "Object reference not set to an instance of an object."

  • Same error in the same place.

  • Strange, it works fine here in VS2012. Did you reference the Html Agility Pack? See if you can compile this code (http://www.filedropper.com/windowsformsapplication6).

  • It worked. I believe the error is that I am calling doc.Load(string) with a string that contains the HTML itself ...

  • Actually no, I switched to an HTML file and still get the same error, but I will sort that out here; I have already seen that it works. Thanks.

  • The error was that I had the wrong path in SelectNodes ... Now I just need to get the path right, but there is a lot of information on the page... Thanks anyway.

  • I edited the question and added the whole HTML, in case you can take a look and help me. I'm kind of stuck on this, and the approach in your answer, even though it works in C#, doesn't work in Xamarin... :/

  • I added the whole HTML of the page to see if it helps. I used the code you posted in Xamarin, but apparently the Html Agility Pack works differently there.

  • @user3896400 Here the path //table/tr/td//text() worked, without tbody (example). I have no way to test on Xamarin, but from what I researched, some adjustments are necessary; it will depend on the problem you are having. There seems to be a version of the Html Agility Pack compatible with Xamarin here; also take a look at this thread to see if it helps. I can't help you much with this.

  • @user3896400 Did you manage to solve the problem? If so, post the solution to help future visitors with the same problem. =)

  • I couldn't... I ended up having to set this project aside for a while.
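
The two problems discussed in the comments above (the null reference in the foreach, and the path that only worked without tbody) can be checked with a small probe like the one below. It is a sketch, assuming the Html Agility Pack is referenced; XPathProbe and CountMatches are hypothetical names. It relies on two behaviours of the library: SelectNodes returns null rather than an empty collection when nothing matches, and the parser does not add a tbody element that is absent from the markup, unlike a browser.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper, not part of the Html Agility Pack.
public static class XPathProbe
{
    // Counts the nodes an XPath expression matches, treating the null that
    // SelectNodes returns for "no match" as zero so callers never
    // dereference it by accident.
    public static int CountMatches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }
}
```

With markup such as <table><tr><td>XXXX</td><td>YYYY</td></tr></table>, the path //table/tbody/tr/td should match nothing while //table/tr/td matches both cells, which is consistent with the path that worked in the comment above.
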
