I have been studying BeautifulSoup for a while to learn how to find tag content and so on.
But I ran into a problem: the content I want is inside a <script type="text/javascript"> tag. Using only find("script") finds just the plain <script> tags, and if I try find('script type="text/javascript"'), the code raises an error.
import re
import requests
from bs4 import BeautifulSoup as bs

def get_cod_produto(url):
    response = requests.get(url)
    data = response.text
    soup = bs(data, 'html.parser')
    body = soup.body
    localizaScript = body.find('script type="text/javascript"')  # this call returns None
    texto = localizaScript.string  # fails here with AttributeError
    array = re.split('"', texto)
    print(array)

get_cod_produto("https://www.kabum.com.br/cgi-local/site/listagem/listagem.cgi?string=rtx+2060&btnG=&pagina=2&ordem=3&limite=30&prime=false&marcas=[]&tipo_produto=[]&filtro=[]")
It returns the following error whenever I pass find() anything other than just "script":
AttributeError                            Traceback (most recent call last)
<ipython-input-565-dfa6bc29cef9> in <module>
----> 1 get_cod_produto("https://www.kabum.com.br/cgi-local/site/listagem/listagem.cgi?string=rtx+2060&btnG=&pagina=2&ordem=3&limite=30&prime=false&marcas=[]&tipo_produto=[]&filtro=[]")

<ipython-input-564-5e17a520cf0d> in get_cod_produto(url)
      5     body = soup.body
      6     localizaScript = body.find('script type="text/javascript"')
----> 7     texto = localizaScript.string
      8     array = re.split('"', texto)
      9     print(array)

AttributeError: 'NoneType' object has no attribute 'string'
How can I pull the information from this tag?
I have never used BeautifulSoup, but if the find method is based on valid CSS selectors, I can explain the problem. Basically, script type="text/javascript" is not a valid CSS selector. If you want to restrict the search for a tag to a certain attribute (such as type), you should wrap the attribute in square brackets, like this: script[type="text/javascript"].
– Luiz Felipe
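To make the comment above concrete: in BeautifulSoup, find() takes a tag name plus attribute filters rather than a selector string, while select_one() is the method that accepts CSS selectors such as script[type="text/javascript"]. The sketch below shows both on a small inline page. The HTML string, its variable contents, and the helper name find_inline_script are made up for illustration and are not taken from the real listing page.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical page fragment standing in for the real listing page.
HTML = """
<html><body>
<script src="/static/app.js"></script>
<script type="text/javascript">var listagem = {"codigo": "12345", "nome": "RTX 2060"};</script>
</body></html>
"""

def find_inline_script(html):
    """Locate the <script type="text/javascript"> tag in two equivalent ways."""
    soup = BeautifulSoup(html, "html.parser")
    # find(): tag name first, attribute filters passed separately.
    by_attrs = soup.find("script", attrs={"type": "text/javascript"})
    # select_one(): accepts a CSS selector with the attribute in brackets.
    by_css = soup.select_one('script[type="text/javascript"]')
    return by_attrs, by_css

by_attrs, by_css = find_inline_script(HTML)

# Both methods return None when nothing matches, so check before using .string.
texto = by_attrs.string if by_attrs is not None else ""
array = re.split('"', texto)
print(array)
```

The None check matters because a lookup that matches nothing returns None, which is what produced the AttributeError in the question: the whole string 'script type="text/javascript"' was treated as a tag name, and no tag has that name.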