9
In PHP we have a function called strip_tags
that removes HTML tags from a given text.
Example:
$text = "meu nome é <strong>Wallace</strong>";
strip_tags($text); // 'meu nome é Wallace'
How can I remove HTML tags from a string in Python?
11
There are several ways, but I don't think any of them does this job better than BeautifulSoup:
>>> from bs4 import BeautifulSoup as bs
>>> bs('<p>hey<span> brrh </span>lolol', 'html.parser').text
'hey brrh lolol'
Note: to install it in Python 3.5, use pip:
pip install --upgrade beautifulsoup4
In-depth reading about Beautifulsoup
11
An example using a regular expression would look like this:
import re
text = 'meu nome é <strong>Wallace</strong>'
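# Replace every "<...>" run (a "<", one or more non-">" characters, then ">") with an empty string
text = re.sub(r'<[^>]+>', '', text)
print(text)  # meu nome é Wallace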
The function re.sub() takes a regular expression as its first parameter and searches the string given as the third parameter for substrings that match that expression, replacing each match with the string given as the second parameter.
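To get something closer to PHP's strip_tags, the same substitution can be wrapped in a small function. This is only a minimal sketch: the name strip_tags and the constant TAG_PATTERN are my own choices for illustration, not part of any Python library, and the pattern is precompiled only so it can be reused across calls.

```python
import re

# Hypothetical helper mirroring PHP's strip_tags; both names are assumptions.
# Pattern: a "<", then one or more characters that are not ">", then ">".
TAG_PATTERN = re.compile(r'<[^>]+>')


def strip_tags(text):
    """Return text with every <...> run removed."""
    # pattern.sub(repl, string) is the compiled form of re.sub(pattern, repl, string)
    return TAG_PATTERN.sub('', text)


print(strip_tags('meu nome é <strong>Wallace</strong>'))  # meu nome é Wallace
```

Keep in mind that a regex like this does not actually parse HTML: a literal "<" in ordinary text that is later followed by a ">" would be removed too, so for untrusted or messy markup a real parser is probably the safer choice.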
Great, it worked on Python 2.7 and Python 3.*. +1
– Wallace Maxters
Yep, it's one of the modules I use most for what I do, @Wallacemaxters
– Miguel
I updated your answer with installation instructions for anyone who doesn't have the module installed, and left a +1 for you =)
– Guilherme Nascimento
@Guilhermenascimento Thank you for adding to the answer, it really is important
– Miguel