I’m trying to extract the value between two HTML tags with Python, that is, the text between an opening tag and its matching closing tag.
I was doing it the way shown below to extract values from a store catalog. Now I need to extract a value from a specific product, that is, from a product page. I’d like something close to Delphi’s PosEx.
The idea is to download the page's HTML, search the text for an initial string and a final string, and return the value between the two.
from urllib.request import urlopen
url = "https://www.panvel.com/panvel/main.do"
pagina = urlopen(url)
texto = pagina.read().decode('utf8')
texto = texto.replace("\t", "")
lista = texto.split("\n")
# lista = texto.replace('\n', '')  # disabled: this would overwrite the list of lines with a single string
htmlInicio = '<span class="box-produto__detalhes-nome">'
htmlFim = '</span>'
contador = 0
while contador < len(lista):
    if lista[contador].startswith(htmlInicio):
        #print(lista[contador])
        nEncontrado1 = len(htmlInicio) + lista[contador].index(htmlInicio)
        nEncontrado2 = lista[contador].index(htmlFim)
        nomeProduto = lista[contador][nEncontrado1:nEncontrado2]
        #print(nomeProduto)
    contador += 1
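A minimal sketch of the PosEx-style search described above, assuming plain `str.find` on the downloaded text is enough. The helper name `text_between`, its `offset` parameter, and the sample HTML string are hypothetical and only illustrate the idea; they are not taken from the real page.

```python
def text_between(text, start, end, offset=0):
    """Return the substring between `start` and `end`, searching from `offset`.

    Returns None when either marker is not found.
    """
    begin = text.find(start, offset)
    if begin == -1:
        return None
    begin += len(start)              # skip past the opening marker
    finish = text.find(end, begin)   # look for the closing marker after it
    if finish == -1:
        return None
    return text[begin:finish]


# Hypothetical sample standing in for the downloaded page.
sample = '<div><span class="box-produto__detalhes-nome">Produto X</span></div>'
name = text_between(sample, '<span class="box-produto__detalhes-nome">', '</span>')
print(name)
```

Because the search runs over the whole text instead of line by line, it does not depend on the tag being at the start of a line. It does stop at the first closing marker, so nested tags of the same kind inside the value would cut the result short.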
And why is the requirement so specific that it rules out XPath, which would solve the problem in a simple way?
– Woss
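For comparison, a rough sketch of a parser-based approach in the spirit of the comment above. XPath itself would need a third-party library, so this sketch swaps in the standard library's `html.parser` instead; the class name `SpanTextCollector` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collect the text of every <span> that carries the class `wanted_class`."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0       # greater than 0 while inside a matching <span>
        self.current = []    # text pieces of the span being read
        self.results = []    # finished span texts

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            self.depth += 1  # nested <span> inside a match
        elif self.wanted_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current).strip())

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


# Hypothetical sample standing in for the downloaded page.
sample_html = (
    '<div><span class="box-produto__detalhes-nome"> Produto <b>X</b></span>'
    '<span class="other">no</span></div>'
)
collector = SpanTextCollector("box-produto__detalhes-nome")
collector.feed(sample_html)
print(collector.results)
```

Unlike a plain string search, this tolerates extra attributes, nested markup inside the value, and tags that do not start a line.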
You have to help those who are helping you! It is frustrating to answer a question as well as possible, with great care, only to learn that an artificial restriction rules out the solution. Please edit the question and describe every artificial restriction, stating the reason for it and how far it extends.
– nosklo