How to send multiple requests at the same time

My program takes URLs from a text file, visits each one, and checks whether its HTML contains a certain text. Would it be possible to read all the lines of the file and, instead of making one request at a time, make them all at once?

1 answer


Asynchronously?

If so, one way to do this is to use the grequests library to make the requests. To install it, type in the terminal:

pip install grequests

You can use it like this (adapted from the documentation):

# -*- coding: utf-8 -*-
import grequests

urls = [
    'https://www.python.org/',
    'https://pypi.python.org/pypi/grequests',
    '/'  # invalid URL, kept to show what a failed request looks like
]

requisicoes = (grequests.get(url) for url in urls)
mp = grequests.map(requisicoes)

for request in mp:
    if request is None:  # grequests.map returns None for requests that failed
        continue
    print("{}: {}".format(request.status_code, request.url))
    # print(request.content)

To use cookies with grequests, pass them to the request (grequests forwards keyword arguments such as cookies to requests):

cookies = { 'foo': 'bar', 'baz': 'bar' }

requisicao = grequests.map(
    [grequests.get('http://httpbin.org/cookies', cookies=cookies)]
)[0]

print("{}: {}".format(requisicao.status_code, requisicao.url))
print(requisicao.text)

The logic is the same with plain requests.
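With plain requests, a Session object keeps cookies (such as a PHPSESSID) across requests. A minimal sketch — the cookie name and value here are only illustrative:

```python
import requests

# A Session persists cookies across every request made through it.
sessao = requests.Session()
sessao.cookies.update({'PHPSESSID': 'abc123'})  # hypothetical session id

# From here on, any request via the session sends the cookie, e.g.:
# resposta = sessao.get('http://httpbin.org/cookies')
# print(resposta.text)

print(sessao.cookies.get('PHPSESSID'))  # abc123
```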

Python 3.2+

If you are using Python >= 3.2, the concurrent.futures module can be useful for running tasks asynchronously. The example below uses requests to make the requests.

# -*- coding: utf-8 -*-
from concurrent.futures import ThreadPoolExecutor, as_completed
import requests

def get(url, timeout):
    return requests.get(url, timeout=timeout)

def requestUrls(urls, timeout=5):
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Map each submitted task back to its URL
        agenda = {executor.submit(get, url, timeout): url for url in urls}

        for tarefa in as_completed(agenda):
            try:
                conteudo = tarefa.result()
            except Exception as e:
                print("The request could not be made!\n{}".format(e))
            else:
                yield conteudo

The number of threads is set by max_workers; if it is omitted or None, the default is based on the number of processors on the machine (os.cpu_count() * 5 since Python 3.5), as described in the concurrent.futures documentation.

Use like this:

urls = [
    'https://www.python.org/',
    'https://pypi.python.org/pypi/requests',
    '/',
]

requisicoes = requestUrls(urls) # timeout is optional; the default is 5

for requisicao in requisicoes:
    codigo = requisicao.status_code
    url = requisicao.url
    conteudo = requisicao.content

    print("{}: {}".format(codigo, url))

  • Thank you very much for answering me, but in the case of grequests, how do I use cookies (PHPSESSID)?

  • @Leonardojulio I've added an example.

  • Thank you so much for your help!! I am testing it.

  • @Leonardojulio OK, if anything comes up, just let me know.
