Web scraping or web crawler: isolating a node

I’m trying to retrieve the value of the span with class "value bra" (<span class="value bra">3,666</span>) from the following markup:

<div class="ticker-financial-market" initiated="true">
   <div class="ticker-slide" style="width: 1446px;">
   	<section class="currencies">
	<div class="info">
		<a href="https://economia.uol.com.br/cotacoes/cambio/dolar-comercial-estados-unidos/">
		     <span class="name"> Dólar com.</span>
		     <div class="numbers">
			     <span class="data neg">-0,19</span>
			     <span class="value bra">3,666</span>
		     </div>
		</a>
	</div>
        </section>
   </div>
</div>

I’m testing with an online C# sandbox, .NET Fiddle.

But I can’t isolate the node. The best I could do without getting an error was this:

foreach(HtmlNode link in doc.DocumentNode.SelectNodes("//section[@class='currencies']"))

This version returns values, but not the one I want:

foreach(HtmlNode link in doc.DocumentNode.SelectNodes("//div/div/section/div"))

The complete code is:

using System;
using System.IO;    
using System.Xml;    
using System.Linq;    
using HtmlAgilityPack;

public class Program
{    
    public static void Main()
    {
        GetTftd();  
    }

    private static void GetTftd() {
        var url = @"https://economia.uol.com.br/cotacoes/cambio/dolar-comercial-estados-unidos/";
        var doc = new HtmlWeb().Load(url);

        foreach(HtmlNode link in doc.DocumentNode.SelectNodes("//div[@class=\"ticker-financial-market\"]/div[@class=\"ticker-slide\"]/section[@class=\"currencies\"]"))
        {
            Console.WriteLine(link.OuterHtml);
        }
    }
}

I appreciate any help.

  • Have you tried using //div[@class='value']?

  • I get: Unhandled exception System.ArgumentNullException: 'Value cannot be null. Arg_ParamName_Name'

  • Oh wait, I gave you the wrong expression. It should be //span[@class='value']

  • That doesn’t work either. Same error for: var HeaderNames = doc.DocumentNode.SelectNodes("//span[@class='value']").ToList();

  • Run-time Exception (line 18): Object reference not set to an instance of an object. Stack Trace: [System.NullReferenceException: Object reference not set to an instance of an object.] at Program.GetTftd() :line 18 at Program.Main() :line 11

  • Sorack, if you want to test, you can use "https://dotnetfiddle.net/8rsuym"; just change the URL and the node.

  • Change the URL to what?

  • Change it to https://economia.uol.com.br/. I want to get the commercial dollar rate. The .NET Fiddle example is already set up; you only need to point it at the site you want to pull the information from.

  • Have you checked whether this value is filled in by JavaScript?

  • Sorack, from what I understand from reading "https://html-agility-pack.net/select-nodes", you need to pass this value.
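The two errors quoted in the comments above are consistent with SelectNodes returning null rather than an empty collection when nothing matches, so both .ToList() and foreach fail on it. Separately, @class='value' is an exact string comparison, so it would not match class="value bra" even when the span is present. The following is a minimal sketch of one way to handle both points. It runs against a trimmed copy of the markup from the question, not the live page, and the class and method names (ValueSpanSketch, SpanTextsByClassToken) are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ValueSpanSketch
{
    // Trimmed copy of the markup from the question, inlined so no network access is needed.
    public const string SampleHtml = @"
<div class=""ticker-financial-market"">
  <section class=""currencies"">
    <div class=""numbers"">
      <span class=""data neg"">-0,19</span>
      <span class=""value bra"">3,666</span>
    </div>
  </section>
</div>";

    // Returns the trimmed text of every <span> whose class list contains the given token.
    public static string[] SpanTextsByClassToken(string html, string token)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pad the class list with spaces so the token only matches as a whole word.
        var xpath = "//span[contains(concat(' ', normalize-space(@class), ' '), ' " + token + " ')]";

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new string[0];
        }
        return nodes.Select(n => n.InnerText.Trim()).ToArray();
    }

    public static void Main()
    {
        foreach (var text in SpanTextsByClassToken(SampleHtml, "value"))
        {
            Console.WriteLine(text); // prints 3,666 for the sample markup
        }
    }
}
```

This only shows that the XPath and the null guard behave as expected on the static fragment. Whether the live page actually delivers that span in its downloaded HTML is a separate question.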

1 answer

I couldn’t get it from that site either: nothing inside <section class="currencies"></section> appears in the loaded document. Try another site:

var url = @"https://themoneyconverter.com/USD/BRL.aspx";
var doc = new HtmlWeb().Load(url);

var value = doc.DocumentNode.SelectSingleNode("//*[@id='cc-ratebox']");

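// SelectSingleNode returns null when nothing matches, so guard before reading InnerText.
Console.WriteLine(value?.InnerText ?? "node not found");

  • Vik, you need to use the right node: /html/body/section[1]/div[2]/div/div[1]/div[2]/h3/a[3]. With this node it works! I couldn’t find it myself, but a VBMANIA user (KERPLUNK) worked it out and passed it on to me. Regards, Fabio I.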
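HtmlWeb.Load does not execute JavaScript, so it helps to tell a wrong XPath apart from content that is simply missing from the downloaded HTML. The sketch below is one way to check that, assuming two hypothetical stand-in fragments: an empty section like the one described in the answer, and the same section with the span filled in. The names RenderCheckSketch and HasNode are illustrative, not part of HtmlAgilityPack.

```csharp
using System;
using HtmlAgilityPack;

public static class RenderCheckSketch
{
    // Returns true when the XPath matches at least one node in the given HTML.
    public static bool HasNode(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode(xpath) != null;
    }

    public static void Main()
    {
        // Hypothetical stand-ins: the section as downloaded (empty) and as seen in a browser (filled).
        var emptyShell = "<section class=\"currencies\"></section>";
        var filledIn = "<section class=\"currencies\"><span class=\"value bra\">3,666</span></section>";
        var xpath = "//section[@class='currencies']//span";

        Console.WriteLine(HasNode(emptyShell, xpath)); // False: the container exists but has no span
        Console.WriteLine(HasNode(filledIn, xpath));   // True
    }
}
```

If the same check fails against the real download while the browser shows the value, the value is probably added after page load, and a different source for the data would be needed.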
