Webscraping of pictures in comments

Question

Asked 6 years ago

I’m working on a web scraper that needs to retrieve comments from a forum that allows image uploads. I was able to get the text and author of each comment using findAll in Beautiful Soup, but I couldn’t find a way to save the image links associated with the comments (not every comment has an image, so not every comment has a link).

The code below shows how I got the comments and how I tried to get the image links:

title_comentou = container.findAll("div",{"class":"posting fullpost"})
comentario = title_comentou[0].text

title_imgem_link = container.findAll("div",{"img":"src"})
linkado = title_imgem_link[0].text

I get this error:

Traceback (most recent call last):
  File "2 - localBotBS4.py", line 54, in <module>
    linkado = title_imgem_link[0].text
IndexError: list index out of range

It seems that title_imgem_link does not hold the value you expect. Have you checked what its value actually is?

– Woss

2019/07/15 at 17:16
I don’t know BeautifulSoup at all, but I think the problem is in your assignment: you are assigning the value returned by the method to a simple variable and not to a list. That is why this error occurs: IndexError: list index out of range. The right thing would be: title_comentou.append(container.findAll("div",{"class":"posting fullpost"}))

– Eudson Durães

2019/07/15 at 17:34
Have you seen the find_all documentation? When you do container.findAll("div",{"img":"src"}) you are looking for a div whose img attribute has the value src, that is, you are looking for <div img="src">... I suppose that’s not what you want...

– fernandosavio

2019/07/15 at 17:54
Without the structure of the page, there is not much we can say.

– Lucas Miranda

2019/07/15 at 18:39
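
Following the point about find_all in the comments, here is a minimal sketch of one way to collect image links per comment. It assumes the images are ordinary <img> tags nested inside each "posting fullpost" div. The forum's real markup is not shown in the question, so SAMPLE_HTML, the example.com URL, and the helper name extract_comments are all hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real forum's HTML is not shown in the question.
SAMPLE_HTML = """
<div class="thread">
  <div class="posting fullpost">First comment, no image.</div>
  <div class="posting fullpost">Second comment.
    <img src="https://example.com/uploads/cat.png">
  </div>
</div>
"""


def extract_comments(container):
    """Return a list of (text, image_links) pairs, one per comment div."""
    results = []
    for post in container.find_all("div", {"class": "posting fullpost"}):
        # Search for <img> tags inside this comment only; src=True keeps
        # just the tags that actually carry a src attribute.
        links = [img["src"] for img in post.find_all("img", src=True)]
        results.append((post.get_text(strip=True), links))
    return results


container = BeautifulSoup(SAMPLE_HTML, "html.parser")
comments = extract_comments(container)
for text, links in comments:
    print(text, links)
```

A comment without images simply yields an empty list, so nothing is indexed with [0] and the IndexError from the question does not arise. Note that the link is read from the tag's src attribute rather than from .text, since an <img> tag has no text content.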

No answers

Browse other questions tagged python beautifulsoup
