Web scraping with Python

Viewed 65 times

0

I used the code below to collect ATM data by passing these parameters to the terminal-search endpoint. The site has changed, and I can no longer find the equivalent request for collecting the data.

Does anyone know whether I can find it by inspecting the page?

import requests

parametro = {'latitude':-23.5097042, 'longitude':-46.6717552, 'status':1, 'lista':1, 'limite':98, 'acessibilidade':''}

r = requests.get('https://www.banco24horas.com.br/index/busca-json-terminal', params=parametro)

1 answer

1

With the browser's developer tools open on the "Network" tab while searching for an ATM on the site, we can see that the request now goes to the URL "https://www.banco24horas.com.br/ajax.php".

If we try the same parameters, however, we get an "undefined" error. This is because the parameters have changed. Copied from the request observed in the Network tab, they are now the following:

params = {'latitude': -23.5097042, 'longitude': -46.6717552, 'acess': 0}

If we send the request at this point, we get the following response:

<br />
<b>Notice</b>:  Undefined index: HTTP_X_REQUESTED_WITH in <b>/var/www/html/b24h/ajax.php</b> on line <b>17</b><br />
{"type":"console","text":"No Ajax"}
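Note that this body mixes an HTML-formatted PHP notice with a JSON object, so calling r.json() on it raises a decoding error instead of returning the "No Ajax" message. A minimal sketch of how the two parts could be separated while debugging, using only the response text shown above (the helper name split_notice_and_json is hypothetical, not part of requests):

```python
import json

# Response body copied from the message above: a PHP notice followed by JSON.
body = (
    '<br />\n'
    '<b>Notice</b>:  Undefined index: HTTP_X_REQUESTED_WITH in '
    '<b>/var/www/html/b24h/ajax.php</b> on line <b>17</b><br />\n'
    '{"type":"console","text":"No Ajax"}'
)


def split_notice_and_json(text):
    """Return (leading non-JSON text, parsed JSON object or None)."""
    start = text.find("{")
    if start == -1:
        return text, None
    try:
        return text[:start], json.loads(text[start:])
    except json.JSONDecodeError:
        return text, None


notice, payload = split_notice_and_json(body)
print(payload)
# {'type': 'console', 'text': 'No Ajax'}
```

The leading text kept in notice is where the name of the missing server variable, HTTP_X_REQUESTED_WITH, shows up.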

The "Undefined index: HTTP_X_REQUESTED_WITH" notice indicates that something else is needed for the request to work: a specific header that the browser also sends, X-Requested-With. If we include it as follows, the request returns the data:

import requests

params = {'latitude': -23.5097042, 'longitude': -46.6717552, 'acess': 0}

headers = {
    "X-Requested-With": "XMLHttpRequest",
}

r = requests.get("https://www.banco24horas.com.br/ajax.php", params=params, headers=headers)

print(r.json())
# {'Results': [{'id_atm': '47530', 'cd_atm': '47530', ...
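Once the request succeeds, the ATM entries appear to sit in a list under the 'Results' key. A small sketch for pulling out the identifiers, assuming every entry carries an 'id_atm' field as in the truncated output above (the helper name atm_ids is hypothetical, and the sample includes only the two fields visible in that output):

```python
def atm_ids(payload):
    """Return the 'id_atm' value of each entry under 'Results'."""
    results = payload.get("Results") or []
    return [entry["id_atm"] for entry in results if "id_atm" in entry]


# Sample built only from the fields visible in the output above.
sample = {"Results": [{"id_atm": "47530", "cd_atm": "47530"}]}
print(atm_ids(sample))
# ['47530']
```

With the real response this would be called as atm_ids(r.json()). A payload without 'Results', such as the "No Ajax" message, yields an empty list.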
  • Thanks, Pedro. Would I need to study JavaScript to be able to do this kind of reading and analysis?

  • For me it gave the error below, but I imagine it has something to do with the company's proxy. I will try again on my home PC. Anyway, thank you for the help. SSLError: HTTPSConnectionPool(host='www.banco24horas.com.br', port=443): Max retries exceeded with url: /ajax.php?latitude=-23.5097042&longitude=-46.6717552&acess=0 (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))

  • @Diogoribeiro You do not need JavaScript for this, only the browser developer tools, which open when you press F12. As for the error, yes, it may be caused by the proxy. Since it appears to be an SSL error, you may want to pass the argument verify=False to requests.get (note that this disables certificate verification).
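A minimal sketch of the verify option from the last comment, assuming the SSL failure really does come from a proxy that re-signs HTTPS traffic. The verify setting of requests accepts either False or a path to a CA bundle file. The function names and the ca_bundle path are hypothetical; the bundle would have to come from whoever manages the proxy.

```python
import requests

URL = "https://www.banco24horas.com.br/ajax.php"
PARAMS = {"latitude": -23.5097042, "longitude": -46.6717552, "acess": 0}
HEADERS = {"X-Requested-With": "XMLHttpRequest"}


def make_session(ca_bundle=None, insecure=False):
    """Build a requests.Session for use behind a TLS-intercepting proxy.

    ca_bundle: path to a PEM file with the proxy's CA certificate
               (hypothetical; keeps certificate verification enabled).
    insecure:  if True and no bundle is given, skip verification entirely.
    """
    session = requests.Session()
    session.headers.update(HEADERS)
    if ca_bundle is not None:
        session.verify = ca_bundle  # verify against the given CA bundle
    elif insecure:
        session.verify = False      # no verification; use only as a last resort
    return session


def fetch_atms(session):
    """Send the request from the answer above and return the decoded JSON."""
    response = session.get(URL, params=PARAMS, timeout=10)
    response.raise_for_status()
    return response.json()
```

For example, fetch_atms(make_session(insecure=True)) corresponds to passing verify=False to requests.get. Pointing ca_bundle at the proxy's certificate is usually the safer choice when that file is available.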
