Creating a program to get the top news from a website

from bs4 import BeautifulSoup
import requests

url = 'http://g1.com.br/'
# Browser-like User-Agent header sent with the request.
header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                        'AppleWebKit/537.36 (KHTML, like Gecko) '
                        'Chrome/51.0.2704.103 Safari/537.36'}

req = requests.get(url, headers=header)
html = req.text

soup = BeautifulSoup(html, 'html.parser')

# Look up the headline elements by class (the div is the text).
colecao = soup.find_all(class_="feed-post-body-title gui-text-title gui-color-primary")

# Print the text of every element found.
for item in colecao:
    print(item.get_text())

The code above should pick up the top news on the site http://g1.com.br/, which is marked up with the tag:

<p class="feed-post-body-title gui-text-title gui-color-primary gui-color-hover">

Unfortunately, it prints nothing and raises no error ("Process finished with exit code 0"). Could anyone help me? I tested with Python 2.7.

1 answer

Your class value is incomplete:

feed-post-body-title gui-text-title gui-color-primary

Should be:

feed-post-body-title gui-text-title gui-color-primary gui-color-hover

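In Beautiful Soup, a string containing several class names passed to class_ is compared against the exact value of the class attribute, so you must give the full value, with the names in the same order. (A single class name, such as feed-post-body-title, also matches on its own.)

Successfully tested in Python 2.7 and Python 3.5.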
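To illustrate the matching rule without depending on the live site, here is a minimal sketch that runs against a made-up HTML snippet. SAMPLE_HTML and its headlines are hypothetical and only mimic the tag quoted in the question; the real page may be structured differently. It compares the partial class string, the full class string, a single class name, and a CSS selector via select().

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the <p> tag quoted in the question;
# the real g1.com.br page may look different.
SAMPLE_HTML = """
<div>
  <p class="feed-post-body-title gui-text-title gui-color-primary gui-color-hover">First headline</p>
  <p class="feed-post-body-title gui-text-title gui-color-primary gui-color-hover">Second headline</p>
  <p class="unrelated">Not a headline</p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Incomplete multi-class string (as in the question): finds nothing,
# because it is not the exact value of the class attribute.
partial = soup.find_all(
    class_="feed-post-body-title gui-text-title gui-color-primary")

# Full attribute value (as in the answer): finds both headlines.
exact = soup.find_all(
    class_="feed-post-body-title gui-text-title gui-color-primary gui-color-hover")

# A single class name matches any tag that carries that class.
single = soup.find_all(class_="feed-post-body-title")

# A CSS selector matches on a subset of classes, in any order.
selected = soup.select("p.gui-color-primary.feed-post-body-title")

headlines = [tag.get_text(strip=True) for tag in selected]
print(len(partial), len(exact), len(single))
print(headlines)
```

If the site later adds or reorders classes, the single-class lookup or the CSS selector is likely to keep working, whereas the exact-string form will not.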

  • Thanks, @Ruilima, I'll test it. Are the spaces part of the attribute value?

  • And class_ really takes an underscore?

  • Yes, class_ is written with an underscore (class is a reserved word in Python), and the spaces are part of the attribute value.

  • Thank you, @Ruilima!
