I understand your "question" as "how can I generalize the code below?"
The following version simplifies your code using str.format(), a method of str objects that fills a string with any values by replacing each pair of braces with the argument whose index appears inside the braces (e.g. "{0}").
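As a quick illustration of how the numbered placeholders are filled, here is a minimal sketch. The values of `base` and `page` are made up for the example and have nothing to do with your site.

```python
# Hypothetical values, chosen only to show how the placeholders are filled.
base = "http://example.com/search"
page = 2

# {0} receives the first argument of format(), {1} the second.
url = "{0}?page={1}".format(base, page)

# An index may be repeated or used out of order.
pattern = "{1}-{0}-{1}".format("a", "b")
```

Here `url` ends up as "http://example.com/search?page=2" and `pattern` as "b-a-b".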
You should also avoid the while-with-counter pattern, in which a while loop keeps running as a variable is incremented by hand. It is like using a Ferrari to deliver pizza: theoretically possible, but a beautiful (and expensive) waste of resources. Instead, use the "for X in range(limit):" structure: Python generates every integer from 0 to limit - 1, stores the current one in X, and runs the indented block once for each value of X, in order.
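To make the comparison concrete, here is a small sketch of both styles side by side. The names `limit`, `visited_while` and `visited_for` are just placeholders for the example.

```python
limit = 3

# Counter style: the counter is created, tested and incremented by hand.
visited_while = []
counter = 0
while counter < limit:
    visited_while.append(counter)
    counter += 1

# range() style: Python produces 0, 1, ..., limit - 1 for you.
visited_for = []
for x in range(limit):
    visited_for.append(x)
```

Both lists end up holding 0, 1 and 2, but the second loop has no counter to forget to increment.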
Finally, you should also define your constants at the beginning of the code, making it easier to modify them in the future. This greatly facilitates maintenance as it improves readability.
from bs4 import BeautifulSoup
import urllib2
import re

# Constants kept at the top so they are easy to change later.
WEBSERVICE = "http://www.fiepb.com.br/industria/pesquisa.php"
QUERY_VARIAVEL = "page=Numconsulta_cadastro="
QUERY_CONSTANTE = "&totalRows_consulta_cadastro=3372&empresa=&cidade=&atividade=&produto=&materiaprima=&classificador=RAZAOSOCIAL&dados=on&Submit=Enviar+Consulta"

# Request each results page in turn (here, pages 0, 1 and 2).
for email_num in range(3):
    html_page = urllib2.urlopen("{0}?{1}{2}{3}".format(WEBSERVICE,
                                                       QUERY_VARIAVEL,
                                                       email_num,
                                                       QUERY_CONSTANTE))
    soup = BeautifulSoup(html_page)
    # Print the href of every link that starts with "mailto:".
    for link in soup.findAll('a', attrs={'href': re.compile("^mailto:")}):
        print link.get('href')
PRO tip: take a look at the requests module :)
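If you want to try requests, a rough sketch could look like the one below. It reuses the constants from the code above. The helper names (`build_url`, `extract_mailto_links`, `fetch_page`, `print_emails`) are invented for this example, and the regular expression is a simplification swapped in for BeautifulSoup so the sketch only depends on requests; the BeautifulSoup version is more robust against unusual HTML.

```python
import re

# Constants copied from the code above.
WEBSERVICE = "http://www.fiepb.com.br/industria/pesquisa.php"
QUERY_VARIAVEL = "page=Numconsulta_cadastro="
QUERY_CONSTANTE = "&totalRows_consulta_cadastro=3372&empresa=&cidade=&atividade=&produto=&materiaprima=&classificador=RAZAOSOCIAL&dados=on&Submit=Enviar+Consulta"

# Simplification: only matches quoted href attributes that start with mailto:.
MAILTO_RE = re.compile(r'href=["\'](mailto:[^"\']+)["\']', re.IGNORECASE)


def build_url(page_num):
    """Build the URL of one results page (hypothetical helper)."""
    return "{0}?{1}{2}{3}".format(WEBSERVICE, QUERY_VARIAVEL,
                                  page_num, QUERY_CONSTANTE)


def extract_mailto_links(html):
    """Return every mailto: href found in the given HTML text."""
    return MAILTO_RE.findall(html)


def fetch_page(page_num, timeout=10):
    """Download one results page and return its HTML as text."""
    # Third-party module (pip install requests); imported here so the
    # helpers above can be used without it.
    import requests
    response = requests.get(build_url(page_num), timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def print_emails(page_nums):
    """Print the mailto: links of each requested page."""
    for page_num in page_nums:
        for link in extract_mailto_links(fetch_page(page_num)):
            print(link)
```

Calling print_emails(range(3)) would visit the same three pages as the loop above. The timeout and raise_for_status() are optional, but they keep the script from hanging or silently printing nothing when the server misbehaves.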
Have you tried repeating the links instead of using a loop with values?
– Matheus Rocha
Hello bigown, thank you for your attention. There are 68 links. How should I do it?
– Alandey
Did you manage to solve your problem?
– durtto