When I fetch the page and print it to the console, the HTML appears incomplete. While the program is running I can see many elements being printed, but once it finishes the console holds less than a tenth of the content that scrolled by during the run.
The actual problem is that I cannot capture an element that exists on the page (the call returns null), which I believe is related to the situation described above.
Can anyone suggest something that might help me solve this?
Code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class WebCaptura {
    public static void main(String[] args) {
        String url = "https://g1.globo.com/";
        try {
            // Fetch and parse the page
            Document doc = Jsoup.connect(url).userAgent("Mozilla").get();
            Element body = doc.getElementsByTag("main").first();
            // Print every element of the parsed document
            System.out.println(doc.getAllElements());
            System.out.println("----- END -----");
            System.out.println(body);
            // first() returns null when no element has this class
            Element news = doc.getElementsByClass("bstn-hl-wrapper").first();
            System.out.println("--- Conteudo de interesse ---"); // "content of interest"
            System.out.println(news);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Console while running:
Console after execution:
The element I am interested in, as it appears on the page:
I removed the console output limit and found the element, but its identifiers had different values from the ones seen in the browser, so Jsoup could not find it using those identifiers.
– Lucas
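Since the comment above points to a console output limit, one way to take the console out of the picture is to write the fetched HTML to a file and search the raw text for the identifier. The following is only a minimal sketch using the Java standard library: the class `HtmlDump` and its methods `dump` and `countOccurrences` are hypothetical names introduced for illustration, and the hard-coded HTML string is a stand-in for what Jsoup would return.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HtmlDump {

    /** Writes the full HTML to a file, so no console buffer limit can truncate it. */
    static Path dump(String html, Path target) throws IOException {
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    /** Counts non-overlapping occurrences of a marker (e.g. part of a class name) in the raw HTML. */
    static int countOccurrences(String html, String marker) {
        if (marker.isEmpty()) {
            return 0; // avoid looping forever on an empty marker
        }
        int count = 0;
        int from = 0;
        while ((from = html.indexOf(marker, from)) != -1) {
            count++;
            from += marker.length();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the real program this string would come from doc.outerHtml()
        String html = "<main><div class=\"bstn-hl-wrapper\">news</div></main>";

        Path file = dump(html, Files.createTempFile("page", ".html"));
        System.out.println("Saved " + Files.size(file) + " bytes to " + file);

        // Searching for a fragment of the class name helps spot identifiers
        // that differ slightly from the ones seen in the browser
        System.out.println("Occurrences of bstn-hl: " + countOccurrences(html, "bstn-hl"));
    }
}
```

If the fragment shows up in the saved file under a slightly different class name, a Jsoup attribute selector such as `doc.select("[class*=bstn-hl]")` may be worth trying instead of an exact `getElementsByClass` lookup. If it does not show up at all, the element is probably added by JavaScript after the page loads, which Jsoup does not execute.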