I’m working on a web scraper that needs to retrieve comments from a forum that allows image uploads. I was able to get the text and author of each comment using findAll
in Beautiful Soup, but I couldn’t find a way to save the image links associated with the comments (not every comment has an image, so not every comment has a link).
The code below shows how I got the comments and how I tried to get the image links:
title_comentou = container.findAll("div",{"class":"posting fullpost"})
comentario = title_comentou[0].text
title_imgem_link = container.findAll("div",{"img":"src"})
linkado = title_imgem_link[0].text
I get this error:
Traceback (most recent call last):
File "2 - localBotBS4.py", line 54, in <module>
linkado = title_imgem_link[0].text
IndexError: list index out of range
It seems that title_imgem_link does not hold the value you expect. Have you already checked what its value is? – Woss
I don’t know Beautiful Soup at all, but I think the problem is in your assignment: you are assigning the value returned by the method to a simple variable and not to a list. That is why this error occurs: IndexError: list index out of range. The right thing would be: title_comentou.append(container.findAll("div",{"class":"posting fullpost"}))
– Eudson Durães
Have you seen the find_all documentation? When you do container.findAll("div",{"img":"src"}) you are looking for a div with the attribute img set to the value src, that is, you are looking for <div img="src">... I suppose that’s not what you want... – fernandosavio
Without seeing the structure of the page, there is not much more to say.
– Lucas Miranda
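As the comment about find_all notes, the attribute dictionary filters on the attributes of the div itself, so {"img":"src"} matches nothing and indexing [0] on the empty result raises the IndexError. A minimal sketch of one possible approach follows: search for img tags inside each comment and read their src attribute, falling back to an empty list when a comment has no image. The HTML here is hypothetical, since the real page structure is not shown in the question; the "container" class name and the image_links helper are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real forum structure is not shown in the question.
html = """
<div class="container">
  <div class="posting fullpost">First comment <img src="https://example.com/a.png"></div>
</div>
<div class="container">
  <div class="posting fullpost">Second comment, no image</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def image_links(tag):
    """Return the src of every <img> inside tag (empty list if there are none)."""
    # src=True keeps only <img> tags that actually carry a src attribute.
    return [img["src"] for img in tag.find_all("img", src=True)]


results = []
for container in soup.find_all("div", {"class": "container"}):
    post = container.find("div", {"class": "posting fullpost"})
    if post is None:
        # No comment body in this container, so skip it instead of indexing.
        continue
    results.append((post.get_text(strip=True), image_links(post)))

for text, links in results:
    print(text, links)
```

Because image_links returns a list, a comment without images simply yields an empty list rather than an exception, so there is no need to index [0] on a result that may be empty.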