Beautiful Soup - Remove a tag keeping Text

I have the following tags:

<p>Projeto N <sup>o</sup> 00.000, DE 00 DE JANEIRO DE 0000.</p>

I would like to remove the <sup> tag while keeping its text. I need the result to look like this:

<p>Projeto N o 00.000, DE 00 DE JANEIRO DE 0000.</p>
  • This HTML is invalid for Beautiful Soup because it starts with </p> rather than <p>; I have posted an answer that assumes valid HTML.

  • Thanks for the correction. That is just what I needed.

1 answer

You can use unwrap(), which replaces a tag with its contents:

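from bs4 import BeautifulSoup as bs

# Name the parser explicitly; html.parser does not add <html>/<body> wrappers
soup = bs('<p>Projeto N <sup>o</sup> 00.000, DE 00 DE JANEIRO DE 0000.</p>', 'html.parser')

soup.sup.unwrap()     # replaces <sup> with its contents; returns the emptied <sup></sup>
print(soup)           # <p>Projeto N o 00.000, DE 00 DE JANEIRO DE 0000.</p>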
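
Note that soup.sup only reaches the first <sup> in the document. If the markup contains several, one option is to loop over find_all() and unwrap each match. The following is a minimal sketch, assuming the built-in html.parser; the helper name unwrap_all and the sample string are made up for illustration and are not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup


def unwrap_all(html, tag_name):
    """Return html with every <tag_name> removed but its text kept.

    unwrap_all is a hypothetical helper, not a Beautiful Soup API.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a list, so unwrapping while iterating is safe
    for tag in soup.find_all(tag_name):
        tag.unwrap()
    return str(soup)


# Sample input invented for this example
html = "<p>N <sup>o</sup> 1 and N <sup>o</sup> 2</p>"
result = unwrap_all(html, "sup")
print(result)  # <p>N o 1 and N o 2</p>
```

Because unwrap() only lifts the children into the parent, any nested markup inside the removed tag is preserved as well.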
