How to use BeautifulSoup to search for a certain word on the page

import requests
from bs4 import BeautifulSoup

url = "https://www.nike.com.br/Snkrs/Produto/Dunk-High-SP/153-169-211-279300"

req = requests.get(url)

html = req.text

soup = BeautifulSoup(html, 'html.parser')

tamanho = soup.main.find('script', attrs={'Tamanho' : 'keywords'})

print(tamanho)

I would like to search the page's HTML for the word "Tamanho" (size), which is where I can find the stock available for each shoe.

But after several attempts, print only outputs None.

Can anyone help me? I would like to do something more advanced: look for that word (size or stock) and print something like:

SIZE 41 : 2

SIZE 42 : 5

and so on...

  • Does this really need to be done with Beautiful Soup? Try a test: go to your sample page, open your browser console, and type window.open("about:blank", "", "_blank").document.write(JSON.stringify(SKUsCorTamanho)); then look at the window that opens and tell me: is this the data you want to extract?

1 answer

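Your find call returns None because attrs matches HTML attributes on the tag (here, an attribute named Tamanho with the value keywords), not text inside the script. To extract the JavaScript object data inside the script tag: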

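import requests
from bs4 import BeautifulSoup
import json


html_text = requests.get('https://www.nike.com.br/Snkrs/Produto/Dunk-High-SP/153-169-211-279300').content

soup = BeautifulSoup(html_text, 'html.parser')

# Take the first <script> inside the element with id="produto" and keep
# everything after the first ' = ', i.e. the object being assigned.
json_text = soup.find(id='produto').script.contents[0].split(' = ', 1)[1]

# Each value describes one size: print the size and whether it is in stock.
for tenis in json.loads(json_text).values():
    print(f'TAMANHO: {tenis["Tamanho"]} : {tenis["TemEstoque"]}')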
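The split-and-parse step can be tried offline, without hitting the site. The sketch below uses only the standard library and a made-up script string; the variable name SKUsCorTamanho comes from the comment on the question, while the keys "a1"/"a2", the sample values, and the helper parse_assigned_object are hypothetical. It also strips a trailing semicolon, which json.loads would otherwise reject if the real script ends with one.

```python
import json

# Hypothetical script text; the real page's variable name and fields may differ.
script_text = (
    'var SKUsCorTamanho = {"a1": {"Tamanho": "41", "TemEstoque": true}, '
    '"a2": {"Tamanho": "42", "TemEstoque": false}};'
)


def parse_assigned_object(text):
    """Return the JSON value on the right-hand side of 'name = {...};'."""
    # Keep everything after the first ' = ', as the code above does.
    _, _, rhs = text.partition(' = ')
    if not rhs:
        raise ValueError("no ' = ' assignment found in script text")
    # json.loads rejects a trailing semicolon, so strip it and any whitespace.
    return json.loads(rhs.strip().rstrip(';'))


skus = parse_assigned_object(script_text)
for sku in skus.values():
    print(f'TAMANHO: {sku["Tamanho"]} : {sku["TemEstoque"]}')
```

If the object on the page is not strict JSON (single quotes, unquoted keys), json.loads will still fail and a different parsing approach would be needed.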

One tip: use the lxml library, which is faster and supports XPath.
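As a rough illustration of the lxml tip, the sketch below selects the same script text with an XPath expression. It runs against a small hypothetical HTML string rather than the real page, and it assumes the script sits somewhere inside the element with id="produto", as in the Beautiful Soup version.

```python
from lxml import html

# Hypothetical HTML standing in for the product page.
page = '''
<html><body>
  <div id="produto">
    <script>var SKUsCorTamanho = {"a1": {"Tamanho": "41"}};</script>
  </div>
</body></html>
'''

tree = html.fromstring(page)

# Text nodes of every <script> nested under the element with id="produto".
scripts = tree.xpath('//*[@id="produto"]//script/text()')

# xpath() returns a list; guard against the element being absent.
script_text = scripts[0] if scripts else None
print(script_text)
```

The resulting string can then go through the same split and json.loads steps as before.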