Python: requests.get("https://pt.stackoverflow.com/") never returns anything

When I call requests.get(url) without a timeout, I never get a response from the server. If I add the keyword argument
timeout=1, for example, I get the response after about 1 second.
Example:

import requests

url = "https://google.com/"
r = requests.get(url, timeout=1)
print(r.elapsed)

I get:

0:00:01.211611

With

r = requests.get(url, timeout=5)

I get:

0:00:05.223328

From what I understand, the function only returns once the timeout is reached. Maybe I'm doing something really wrong, but I would expect get to return as soon as the server responds, not after the timeout has elapsed.

I get the following result by printing r.__dict__ (content removed):

_content_consumed True
_next None
status_code 200
headers {'Date': 'Mon, 29 Jun 2020 20:40:42 GMT', 'Expires': '-1', 'Cache-Control': 'private, max-age=0', 'Content-Type': 'text/html; charset=ISO-8859-1', 'P3P': 'CP="This is not a P3P policy! See g.co/p3phelp for more info."', 'Content-Encoding': 'gzip', 'Server': 'gws', 'Content-Length': '5387', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'Set-Cookie': '1P_JAR=2020-06-29-20; expires=Wed, 29-Jul-2020 20:40:42 GMT; path=/; domain=.google.com; Secure, NID=204=AuH0fx2X4m3kT6AeVtg0YMDEGr6uehL7Kt8WyzO7cmIlNDq_qnh4QXcUybI9aPOMAuC8_PuHsidpBN--vMfU1jJRreb2lM340XOSv2-CZAkK1qfXbrSSii9cRG-uX1caNB3HlnL4QDjErvgcYPtedlatyvLEaLALJ4Lj0aigT7c; expires=Tue, 29-Dec-2020 20:40:42 GMT; path=/; domain=.google.com; HttpOnly'}
raw <urllib3.response.HTTPResponse object at 0x7f1cb7b5f0b8>
url http://www.google.com/
encoding ISO-8859-1
history []
reason OK
cookies <RequestsCookieJar[<Cookie 1P_JAR=2020-06-29-20 for .google.com/>, <Cookie NID=204=AuH0fx2X4m3kT6AeVtg0YMDEGr6uehL7Kt8WyzO7cmIlNDq_qnh4QXcUybI9aPOMAuC8_PuHsidpBN--vMfU1jJRreb2lM340XOSv2-CZAkK1qfXbrSSii9cRG-uX1caNB3HlnL4QDjErvgcYPtedlatyvLEaLALJ4Lj0aigT7c for .google.com/>]>
elapsed 0:00:05.147145
request <PreparedRequest [GET]>
connection <requests.adapters.HTTPAdapter object at 0x7f1cb6d8e1d0>
[Finished in 6.0s]

I also noticed that with simpler HTML pages, such as http://www.brainjar.com/java/host/test.html,
the problem does not occur and I get the response almost immediately:
0:00:00.212320

2 answers

1

The fact that it returns only at the end of the timeout leads me to believe that the request is failing. I ran the code almost exactly as written and it worked. Is it possible that your firewall is blocking the Python executable? Try printing the status code.

import requests

url = "https://google.com/"
r = requests.get(url, timeout=10)
print(r.elapsed)
print(r.status_code)

0:00:00.180191
200

0

I just found out that the problem occurs when the request goes over IPv6.
Over IPv4, the problem does not occur.
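A likely explanation for elapsed being just over the timeout, assuming the resolver lists IPv6 addresses first: the client tries the IPv6 address, waits out the full timeout, and only then falls back to IPv4, which answers quickly. One way to check this is to time a plain TCP connection per address family. The sketch below uses only the standard library; time_connect is a hypothetical helper name, not part of requests.

```python
import socket
import time


def time_connect(host, port, family, timeout=3.0):
    """Return the seconds needed to open a TCP connection to host using
    only the given address family, or None if no address of that family
    resolves or every connection attempt fails or times out."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return None  # the host has no address of this family
    for af, socktype, proto, _canonname, sockaddr in infos:
        start = time.monotonic()
        try:
            with socket.socket(af, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return time.monotonic() - start
        except OSError:
            continue  # refused, unreachable or timed out: try the next address
    return None
```

Comparing time_connect("google.com", 443, socket.AF_INET) with time_connect("google.com", 443, socket.AF_INET6) should show whether only the IPv6 path hangs: a None for IPv6 next to a small number for IPv4 would match the behaviour described above.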

The following code, taken from Jeff Kaufman's answer at https://stackoverflow.com/questions/33046733/force-requests-to-use-ipv4-ipv6, solved my problem.

# Monkey patch to force IPv4, since FB seems to hang on IPv6
import socket
old_getaddrinfo = socket.getaddrinfo
def new_getaddrinfo(*args, **kwargs):
    responses = old_getaddrinfo(*args, **kwargs)
    return [response
            for response in responses
            if response[0] == socket.AF_INET]
socket.getaddrinfo = new_getaddrinfo

This forces the request to go over IPv4.
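Since the patch above replaces socket.getaddrinfo for the whole process, it may be worth limiting it to the calls that need it. Here is a minimal sketch of the same filter wrapped in a context manager that restores the original function afterwards. The name force_ipv4 is my own choice for illustration, and like the original patch it is not thread-safe.

```python
import contextlib
import socket


@contextlib.contextmanager
def force_ipv4():
    """Temporarily make socket.getaddrinfo return only IPv4 results."""
    original = socket.getaddrinfo

    def ipv4_only(*args, **kwargs):
        # Same filter as the monkey patch: keep AF_INET entries only.
        return [info for info in original(*args, **kwargs)
                if info[0] == socket.AF_INET]

    socket.getaddrinfo = ipv4_only
    try:
        yield
    finally:
        socket.getaddrinfo = original  # always restore, even after an error
```

It would be used as "with force_ipv4(): r = requests.get(url, timeout=5)", so that name resolution outside the with block still sees IPv6 addresses.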
