Get a specific HTML attribute using BeautifulSoup

I’m trying to extract an attribute called srcset from an img tag:

<img _ngcontent-games2-c5="" class="mdc-image-list__image ng-lazyloaded" offset="100" src="/assets/img/lazy-load.jpg" srcset="https://juegosv.com/wp-content/uploads/2019/06/powerline-io.jpg" alt="Powerline.io">

My Python code is as follows:

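from bs4 import BeautifulSoup
from requests import get

# Download the static HTML (no JavaScript is executed at this point)
html = get("http://jogos.io").text

soup = BeautifulSoup(html, "html5lib")

# Locate the image list and collect its <img> tags
ul = soup.find("ul", {"class": "mdc-image-list"})
imgs = ul.find_all("img")

print(f"{imgs}\n")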

It returns a list with several img tags:

[<img _ngcontent-sc7="" alt="Powerline.io" class="mdc-image-list__image" offset="100" src="/assets/img/lazy-load.jpg"/>, <img _ngcontent-sc7="" alt="tacticscore.io" class="mdc-image-list__image" offset="100" src="/assets/img/lazy-load.jpg"/>, ...]

But the srcset attribute does not appear in any of them. I can only see it when I inspect the element on the site in the browser.

Question: Is it possible to get this attribute? What do I need to do?
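
To illustrate the difference described above, here is a minimal sketch that parses the two img snippets quoted in the question, one as shown by the browser inspector and one as returned by requests. The helper name get_srcset is made up for this example, and html.parser is used only to avoid the html5lib dependency. Tag.get returns None when the attribute is absent, so the parsing itself is fine; the attribute simply is not in the downloaded HTML.

```python
from bs4 import BeautifulSoup

# Markup as seen in the browser inspector (after JavaScript has run).
rendered = (
    '<img _ngcontent-games2-c5="" class="mdc-image-list__image ng-lazyloaded" '
    'offset="100" src="/assets/img/lazy-load.jpg" '
    'srcset="https://juegosv.com/wp-content/uploads/2019/06/powerline-io.jpg" '
    'alt="Powerline.io">'
)

# Markup as returned by requests (before any JavaScript runs).
static = (
    '<img _ngcontent-sc7="" alt="Powerline.io" class="mdc-image-list__image" '
    'offset="100" src="/assets/img/lazy-load.jpg"/>'
)


def get_srcset(html):
    """Return the srcset of the first <img> in html, or None if it has none.

    Hypothetical helper, named for this example only.
    """
    img = BeautifulSoup(html, "html.parser").find("img")
    return img.get("srcset")


print(get_srcset(rendered))  # https://juegosv.com/wp-content/uploads/2019/06/powerline-io.jpg
print(get_srcset(static))    # None
```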

  • You may need to wait for some JavaScript to execute; Selenium would probably do it. But what is the goal? Downloading the images?

  • Actually, the goal is just to learn. I’ve been following Django tutorials online, and I had the idea of building a site that "takes" information from other sites and displays it on mine without my having to download anything. It is purely a learning exercise.

  • I looked through one of the JavaScript files and saw that it is what inserts srcset into the HTML. Can I use Selenium for that? I have only used Selenium for automation.

  • In theory it should work with Selenium, since it "simulates" a browser, but you would need to test it, because things get a little unpredictable with JavaScript. You would essentially just replace requests with Selenium.
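
The suggestion in the comments could be sketched roughly as follows. This is untested against the actual site and rests on several assumptions: Chrome and a matching driver are available, the srcset attribute shows up within the timeout, and the list keeps the mdc-image-list class seen in the question. The function names extract_srcsets and fetch_rendered_html are made up for this example. Because the ng-lazyloaded class hints at lazy loading, images that have not scrolled into view may still lack srcset.

```python
from bs4 import BeautifulSoup


def extract_srcsets(html):
    """Return the srcset value of every <img> inside the image list that has one."""
    soup = BeautifulSoup(html, "html.parser")
    return [img["srcset"] for img in soup.select("ul.mdc-image-list img[srcset]")]


def fetch_rendered_html(url, timeout=10):
    """Load url in headless Chrome and return the DOM after JavaScript has run."""
    # Imported here so extract_srcsets stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until at least one image in the list has received its srcset.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located(
                (By.CSS_SELECTOR, "ul.mdc-image-list img[srcset]")
            )
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling extract_srcsets(fetch_rendered_html("http://jogos.io")) would then replace the requests call from the question, while the BeautifulSoup part stays essentially the same.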

No answers
