2
I have the following situation:
<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>
Using BeautifulSoup, how do I get only the href and the text of the 'a' tag inside the 'h3'?
3
The easiest way is to look for the a tag inside the h3 element:
from bs4 import BeautifulSoup
code = '''<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">
Google
</a>
</h3>'''
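# Name the parser explicitly; otherwise bs4 guesses one and emits a warning
soup = BeautifulSoup(code, 'html.parser')
# Attribute access returns the first a tag nested inside the first h3
tag_a = soup.h3.a
# strip() drops the line breaks that surround "Google" in this markup
print(tag_a.text.strip())
print(tag_a['href'])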
It is also possible to get every a tag inside the h3 with soup.h3.find_all('a') (findAll is the older spelling of the same method); it returns a list of all matching tags.
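As a minimal sketch of the find_all variant mentioned above: the second link inside the h3 is hypothetical markup added only so the list has more than one entry, and the name links is an illustrative choice, not something from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the python.org link is added for illustration only
html = '''<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
<a href="https://www.python.org">Python</a>
</h3>'''

soup = BeautifulSoup(html, 'html.parser')

# find_all returns every a tag under the h3; the Globo link is outside it
links = [(a.get_text(strip=True), a['href']) for a in soup.h3.find_all('a')]

for text, href in links:
    print(text, href)
```

The link to Globo is skipped because the search starts at the h3 rather than at the whole document.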
2
Just fetch the h3 tag and then fetch the a element inside it:
from bs4 import BeautifulSoup
data = """<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>"""
soup = BeautifulSoup(data, 'html.parser')
# Find the h3 by tag name and class, then the first a tag inside it
h3 = soup.find('h3', class_='b')
a = h3.find('a')
print(a['href'])
print(a.text)
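One thing to watch with this approach is that find returns None when nothing matches, so chaining it on a page without the h3 raises an AttributeError. A hedged sketch of a guarded version follows; the helper name first_link_in_h3 and its css_class parameter are hypothetical and not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

data = """<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>"""


def first_link_in_h3(html, css_class='b'):
    """Return (text, href) for the first a tag in the matching h3, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    h3 = soup.find('h3', class_=css_class)
    if h3 is None:
        return None  # no h3 with that class in the document
    a = h3.find('a')
    if a is None:
        return None  # the h3 exists but holds no link
    # get() returns None instead of raising if the href attribute is missing
    return a.get_text(strip=True), a.get('href')


print(first_link_in_h3(data))
print(first_link_in_h3('<p>no heading here</p>'))
```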