My program takes URLs from a text file, visits each one, and checks whether its HTML contains a certain text. Would it be possible to read all the lines of the file and, instead of making one request at a time, make them all at once?
Asynchronously?
If so, one way to do this is to use the grequests library to make the requests. To install it, type in the terminal:
pip install grequests
You can use it like this (adapted from the documentation):
# -*- coding: utf-8 -*-
import grequests

urls = [
    'https://www.python.org/',
    'https://pypi.python.org/pypi/grequests',
    '/'  # invalid URL, kept to show what happens when a request fails
]

# Build the requests lazily and dispatch them all at once
requisicoes = (grequests.get(url) for url in urls)
mp = grequests.map(requisicoes)

for request in mp:
    if request is None:  # a failed request comes back as None
        continue
    print("{}: {}".format(request.status_code, request.url))
    # print(request.content)
To use cookies with grequests, do the following:
sessao = grequests.Session()  # grequests exposes requests.Session
cookies = {'foo': 'bar', 'baz': 'bar'}
requisicao = sessao.get('http://httpbin.org/cookies', cookies=cookies)

print("{}: {}".format(requisicao.status_code, requisicao.url))
print(requisicao.text)
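If the cookies (a PHPSESSID, for example) also need to be sent on the asynchronous requests, a minimal sketch is to pass cookies= to each grequests.get before calling map; the cookie name and value below are placeholders, not something from your code:

# Sketch: send the same cookies on every asynchronous request.
# The cookie name and value are illustrative placeholders.
import grequests

cookies = {'PHPSESSID': 'session-id-here'}
urls = ['http://httpbin.org/cookies', 'https://www.python.org/']

requisicoes = (grequests.get(url, cookies=cookies) for url in urls)
for resposta in grequests.map(requisicoes):
    if resposta is not None:  # failed requests come back as None
        print(resposta.status_code, resposta.url)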
The logic is the same for the requests library.
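For example, a minimal sketch with a plain requests.Session, where cookies set on the session are sent automatically on every request made through it:

import requests

sessao = requests.Session()
sessao.cookies.update({'foo': 'bar', 'baz': 'bar'})  # persists for the whole session

resposta = sessao.get('http://httpbin.org/cookies')
print(resposta.text)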
If you are using Python >= 3.2, the concurrent.futures module can be useful for performing tasks asynchronously. The example below uses the requests library to make the requests.
# -*- coding: utf-8 -*-
from concurrent.futures import ThreadPoolExecutor, as_completed
import requests

def get(url, timeout):
    return requests.get(url, timeout=timeout)

def requestUrls(urls, timeout=5):
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Schedule one task per URL and map each future back to its URL
        agenda = {executor.submit(get, url, timeout): url for url in urls}
        for tarefa in as_completed(agenda):
            try:
                conteudo = tarefa.result()
            except Exception as e:
                print("Could not make the request!\n{}".format(e))
            else:
                yield conteudo
The number of threads is set by max_workers; if it is omitted or None, the default is derived from the number of processors on the machine (see the concurrent.futures documentation).
Use it like this:
urls = [
    'https://www.python.org/',
    'https://pypi.python.org/pypi/requests',
    '/',  # invalid URL; the generator prints the error and skips it
]

requisicoes = requestUrls(urls)  # timeout is optional, the default is 5

for requisicao in requisicoes:
    codigo = requisicao.status_code
    url = requisicao.url
    conteudo = requisicao.content
    print("{}: {}".format(codigo, url))
Thank you very much for answering me, but in the case of grequests, how do I use cookies (PHPSESSID)?
– Leonardo Julio
@Leonardojulio An example has been added.
– stderr
Thank you so much for your help!! I am testing
– Leonardo Julio
@Leonardojulio Ok, if you need anything else, just let me know.
– stderr