I’m working on a web scraper that needs to retrieve comments from a forum that allows image uploads. I was able to get the text and author of each comment using findAll in Beautiful Soup, but I couldn’t find a way to save the image links associated with the comments (not every comment has an image, so not every comment has a link).

The code below shows how I got the comments and how I tried to get the image links:

title_comentou = container.findAll("div",{"class":"posting fullpost"})
comentario = title_comentou[0].text

title_imgem_link = container.findAll("div",{"img":"src"})
linkado = title_imgem_link[0].text

I get this error:

Traceback (most recent call last):
  File "2 - localBotBS4.py", line 54, in <module>
    linkado = title_imgem_link[0].text
IndexError: list index out of range
  • It seems that title_imgem_link does not hold the value you expect. Have you checked what its value actually is?

  • I don’t know BeautifulSoup at all, but I think the problem is in your assignment: you are assigning the value returned by the method to a plain variable and not to a list. That is why the error IndexError: list index out of range occurs. The right thing would be: title_comentou.append(container.findAll("div",{"class":"posting fullpost"}))

  • Have you read the find_all documentation? When you call container.findAll("div",{"img":"src"}) you are looking for a div that has an attribute img whose value is src, that is, you are looking for <div img="src">... I suppose that’s not what you want. No such div exists, so the result is an empty list and indexing it with [0] raises the IndexError.

  • Without seeing the structure of the page, there isn’t much we can say.
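
Following up on the find_all comment above, here is a minimal sketch of one way to collect image links per comment. It assumes the images are ordinary <img> tags nested inside each "posting fullpost" div; the HTML string and the example.com URL are made up for illustration, since the real page structure is not shown in the question. The idea is to search for img tags inside each comment and read their src attribute, so a comment without images simply yields an empty list instead of raising IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real forum's structure is not shown in the question.
html = """
<div class="posting fullpost">First comment <img src="https://example.com/a.png"></div>
<div class="posting fullpost">Second comment, no image</div>
"""

container = BeautifulSoup(html, "html.parser")

results = []
for post in container.find_all("div", {"class": "posting fullpost"}):
    # Look for <img> tags inside this comment only. The list is empty when
    # the comment has no image, so nothing is indexed blindly with [0].
    links = [img["src"] for img in post.find_all("img", src=True)]
    results.append((post.get_text(strip=True), links))

for text, links in results:
    print(text, links)
```

If the images on the real page live outside the comment div (for example in a sibling element), the inner find_all call would need to target that element instead.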


