Python & Selenium - wait for the download to finish before closing the browser

0

How can I close the webdriver only after my download has finished? time.sleep() didn't help much, because download times vary a lot.

Here is my code:

ExportCSV = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, '//div/div[2]/div/a')))
ExportCSV.click()
driver.quit()

2 answers

1

William, thank you so much. I recently found some code and forgot to share it here.

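def every_downloads_chrome(driver):
    # Open Chrome's downloads page if we are not already on it
    if not driver.current_url.startswith("chrome://downloads"):
        driver.get("chrome://downloads/")
    # Returns the list of file URLs once every item is COMPLETE, otherwise
    # nothing (None), which makes WebDriverWait poll again.
    # Note: this relies on Chrome's internal page and may break between versions.
    return driver.execute_script("""
        var items = downloads.Manager.get().items_;
        if (items.every(e => e.state === "COMPLETE"))
            return items.map(e => e.fileUrl || e.file_url);
        """)

# Wait up to 120 seconds, polling once per second
wait = WebDriverWait(driver, 120, 1).until(every_downloads_chrome)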

It opens Chrome's downloads page and checks whether every download has completed.
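If relying on Chrome's internal downloads page is a concern, one alternative is to watch the download folder instead. The sketch below is only an illustration under a few assumptions: the function name wait_for_downloads and the suffix list are hypothetical, the folder is assumed to be used only for this download and to be empty beforehand, and it assumes the browser marks unfinished files with a temporary extension (.crdownload in Chrome, .part in Firefox).

```python
import os
import time

# Assumed temporary extensions for downloads that are still in progress
TEMP_SUFFIXES = ('.crdownload', '.part')


def wait_for_downloads(directory, timeout=120, poll=1.0):
    """Block until `directory` holds at least one file and no temporary ones.

    Returns the sorted list of file paths, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        pending = [n for n in names if n.endswith(TEMP_SUFFIXES)]
        # Done when something was downloaded and nothing is still in progress
        if names and not pending:
            return sorted(os.path.join(directory, n) for n in names)
        if time.monotonic() >= deadline:
            raise TimeoutError('downloads did not finish in %s seconds' % timeout)
        time.sleep(poll)
```

It would be called between ExportCSV.click() and driver.quit(), passing the folder the browser is configured to download into.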

  • This is very interesting, a simple solution.

0

As far as I know (and I have searched), you can't do this with Selenium itself. What you can do is take the link's URL and download the file directly with Python, for example:

If it’s Python 2.x

I couldn't test this one, since I don't use Python 2.

from os.path import basename
from urllib import urlretrieve

...

ExportCSV = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, '//div/div[2]/div/a')))

url = ExportCSV.get_attribute('href')

# Remove the query string (if any)
arquivo = url.split('?', 1)[0]

# Remove whitespace and slashes
arquivo = arquivo.strip().strip('/')

# Keep only the file name
arquivo = basename(arquivo)

# Download the file and save it under that name
urlretrieve(url, arquivo)

driver.quit()

If it’s Python 3.x

from os.path import basename
from urllib import request

...

ExportCSV = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, '//div/div[2]/div/a')))

url = ExportCSV.get_attribute('href')

# Remove the query string (if any)
arquivo = url.split('?', 1)[0]

# Remove whitespace and slashes
arquivo = arquivo.strip().strip('/')

# Keep only the file name
arquivo = basename(arquivo)

# Download the file and save it under that name
with request.urlopen(url) as response, open(arquivo, 'wb') as file:
    file.write(response.read())

driver.quit()
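The file-name steps above (drop the query string, trim slashes, keep only the base name) could also be handed to urllib.parse, which copes with URLs that have no query string at all. This is just a sketch for Python 3; the helper name filename_from_url and the fallback name 'download' are assumptions made for illustration.

```python
from os.path import basename
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default='download'):
    """Derive a local file name from a URL, ignoring query string and fragment."""
    # urlsplit separates the path from the query string and fragment
    path = urlsplit(url.strip()).path
    # Drop trailing slashes, then keep only the last path segment
    name = basename(path.rstrip('/'))
    # Decode percent-escapes such as %20; fall back if the path was empty
    return unquote(name) or default
```

For example, filename_from_url(ExportCSV.get_attribute('href')) could replace the three arquivo lines.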

Sessions and cookies

The previous examples are basic and are meant mainly to show how you can approach the problem. It is important to note that websites use cookies, and some links only work with them, for example when the site uses anti-CSRF protection (backed by a cookie or session) or a login session. That would prevent you from accessing the link directly from Python.

It is possible to work around this, however. Selenium itself offers the method driver.get_cookies(), which returns the cookies of the current site for the current path. (If there are cookies set exclusively for other paths in the same domain, I believe this method does not return them; it is similar to document.cookie in browser JavaScript.) It returns a list of dictionaries, something like this (the example comes from Google's website, with some sensitive cookies omitted):

[{'domain': 'google.com', 'expiry': 1569293037.639768, 'httpOnly': False, 'name': '1P_JAR', 'path': '/', 'secure': False, 'value': '2019-08-25-02'},
 {'domain': 'google.com', 'expiry': 1571885037, 'httpOnly': False, 'name': 'OGPC', 'path': '/', 'secure': False, 'value': '19013527-1:'},
 {'domain': 'www.google.com', 'expiry': 1566787437, 'httpOnly': False, 'name': 'UULE', 'path': '/', 'secure': False, 'value': '...'},
 {'domain': 'google.com', 'expiry': 1582512236.772455, 'httpOnly': True, 'name': 'NID', 'path': '/', 'secure': False, 'value': '...'}]

With the cookie list in hand, you now need to pass those values to urllib (or to another library of your choice; Python has several, built-in or third-party, that can do the job).

To solve this you can use:

from http.cookiejar import Cookie, CookieJar

And set the "cookie jar" in urllib like this:

jar = CookieJar()

request_cookie = Cookie(0, cookie_name, cookie_value, port, port_specified, domain,
                            domain_specified, domain_initial_dot, path, path_specified,
                            secure, expires, discard, comment, comment_url, rest, rfc2109)

jar.set_cookie(request_cookie)

opener = request.build_opener(request.HTTPCookieProcessor(jar))

I will post a working example as soon as I can.
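As a rough sketch of how the pieces above might fit together, the function below copies the dictionaries returned by driver.get_cookies() into a CookieJar and builds an opener from it. The names selenium_cookies_to_jar and opener_with_cookies are hypothetical, and the values chosen for fields Selenium does not report (port, comment, and so on) are assumptions; it has not been checked against a real site.

```python
from http.cookiejar import Cookie, CookieJar
from urllib import request


def selenium_cookies_to_jar(cookies):
    """Copy the dicts returned by driver.get_cookies() into a CookieJar."""
    jar = CookieJar()
    for c in cookies:
        domain = c.get('domain', '')
        expiry = c.get('expiry')  # absent for session cookies
        jar.set_cookie(Cookie(
            version=0,
            name=c['name'],
            value=c['value'],
            port=None,
            port_specified=False,
            domain=domain,
            domain_specified=bool(domain),
            domain_initial_dot=domain.startswith('.'),
            path=c.get('path', '/'),
            path_specified=True,
            secure=c.get('secure', False),
            expires=int(expiry) if expiry is not None else None,
            discard=expiry is None,  # session cookies are discarded on exit
            comment=None,
            comment_url=None,
            rest={'HttpOnly': None} if c.get('httpOnly') else {},
            rfc2109=False,
        ))
    return jar


def opener_with_cookies(cookies):
    """Build a urllib opener that sends the given Selenium cookies."""
    jar = selenium_cookies_to_jar(cookies)
    return request.build_opener(request.HTTPCookieProcessor(jar))
```

The file could then be fetched with opener_with_cookies(driver.get_cookies()).open(url) in place of request.urlopen(url), before calling driver.quit().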
