Get a value from a string [PYTHON]

Viewed 745 times

0

I’m trying to extract one specific value from a response I got using requests.

Python returned all the site’s HTML in print(Return.text), but I want to get only this input:

<input class="give-input required" type="text" name="descricao" autocomplete="given-name" placeholder="First Name" id="descricao" value="Roberta Bellamy" required="" aria-required="true">

I’m new to Python!

  • Here is the input whose value I want to assign to a variable: <input class="give-input required" type="text" name="descricao" autocomplete="given-name" placeholder="First Name" id="descricao" value="Roberta Bellamy" required="" aria-required="true">

  • Welcome to SOpt. I recommend taking the tour and reading a little about the Minimal, Complete, and Verifiable Example before creating questions. As it stands, the code cannot be reproduced or tested. What have you managed to do so far?

  • But regardless of what the full HTML and your code look like, I believe you can use regex, string manipulation, or even the BeautifulSoup library.

2 answers

3

Even though you haven’t provided enough information, I’ll demonstrate how easy this is to do with BeautifulSoup.

BeautifulSoup is a web-scraping library for searching HTML and XML data. Usage is simple: after installing it (pip install bs4), you import the main class BeautifulSoup from bs4 and instantiate it with an HTML string and a parser of your choice (such as html.parser or lxml).

To find a specific tag, pass just enough information to match that single tag rather than several tags of the same type.

As ready-to-run code, that would be:

from bs4 import BeautifulSoup

textoHTML = '<html><head></head><body><input aria-required="true" autocomplete="given-name" class="give-input required" id="descricao" name="descricao" placeholder="First Name" required="" type="text" value="Roberta Bellamy"/></body></html>'  # YOUR HTML HERE

soup = BeautifulSoup(textoHTML, 'html.parser')

# Match the <input> tag by its classes and its id
print(soup.find("input", class_=['give-input', 'required'], id='descricao'))
# PRINTS: <input aria-required="true" autocomplete="given-name" class="give-input required" id="descricao" name="descricao" placeholder="First Name" required="" type="text" value="Roberta Bellamy"/>
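
Since the question asks for the value rather than the whole tag, here is a minimal sketch of reading the value attribute from the tag once it has been found. It assumes the id descricao is unique on the page. The names html, tag and first_name are illustrative only, and the shortened HTML string stands in for Return.text from the question.

```python
from bs4 import BeautifulSoup

# Stand-in for the response body; in the question this would be Return.text
html = (
    '<html><body><input class="give-input required" type="text" '
    'name="descricao" id="descricao" value="Roberta Bellamy" required="">'
    '</body></html>'
)

soup = BeautifulSoup(html, "html.parser")

# An id is expected to be unique, so it is enough to single out the tag
tag = soup.find("input", id="descricao")

# Attributes are read like dictionary keys; find() returns None on no match
first_name = tag["value"] if tag is not None else None
print(first_name)  # Roberta Bellamy
```

If the attribute itself might be missing, tag.get("value") returns None instead of raising a KeyError.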

0

You can use the string method replace to remove the tags, like this:

linha1 = "<p>Conceito <span>e</span> <span>Signi</span>ficado<span> </span>de <span>Tex</span>to.</p>"
linha2 = "<p><span>Geralmente</span>, entendemos <span>o</span> texto como um <span>conjunto</span> de frases</p>"

linha1 = linha1.replace('<span>','').replace('</span>','')
linha2 = linha2.replace('<span>','').replace('</span>','')

print(linha1)
print(linha2)

Output:

<p>Conceito e Significado de Texto.</p>
<p>Geralmente, entendemos o texto como um conjunto de frases</p>

  • If you also want to remove the paragraph tags, you could do it this way: linha1.replace('<span>','').replace('</span>','').replace('<p>','').replace('</p>','')
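
Chaining replace calls gets long as more tags are involved, so the same idea can be wrapped in a small helper. This is only a sketch: strip_tags is a hypothetical name, not a library function, and it assumes the tags carry no attributes, so it would not handle something like the input tag from the question.

```python
def strip_tags(text, tag_names):
    """Remove the opening and closing tags named in tag_names, keeping the inner text."""
    for name in tag_names:
        # Only bare tags such as <span> and </span> are matched;
        # a tag with attributes, e.g. <span class="x">, is left untouched.
        text = text.replace(f"<{name}>", "").replace(f"</{name}>", "")
    return text


linha1 = "<p>Conceito <span>e</span> <span>Signi</span>ficado<span> </span>de <span>Tex</span>to.</p>"

sem_tags = strip_tags(linha1, ["span", "p"])
print(sem_tags)  # Conceito e Significado de Texto.
```

For HTML with attributes or nesting, a parser such as BeautifulSoup is the more robust choice.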
