Download a page's HTML with Python after logging in

Hello, I'm wondering whether there is a way to take a site's URL plus the login and password strings, simulate the login, go to the next page (already logged in), and download its source code. Note: I want to use only the urllib library, and I'm on Python 3.2. I would appreciate an answer that includes working code. The URLs are in the code below.

My code so far (copied from the web and modified):

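import http.cookiejar
import urllib.parse    # urlencode lives here; a bare "import urllib" does not guarantee it is loaded
import urllib.request  # Request, build_opener, urlopen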

# Store the cookies and create an opener that will hold them
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

# Add our headers
opener.addheaders = [('User-agent', 'RedditTesting')]

# Install our opener (note that this changes the global opener to the one
# we just made, but you can also just call opener.open() if you want)

urllib.request.install_opener(opener)

# The login form's action URL (au) and the page to download after logging in (lu)
au = 'http://portaldoaluno.dombarreto.g12.br/corpore.net/Login.aspx'
lu = 'http://portaldoaluno.dombarreto.g12.br/Corpore.Net/Main.aspx?SelectedMenuIDKey=&ShowMode=2'
# Input parameters we are going to send
payload = {
  'btnLogin': 'Acessar',
  'txtUser': 'login',
  'txtPass': 'senha',
  'ddlAlias': 'CorporeRM'
  }

# Use urllib to encode the payload
data = urllib.parse.urlencode(payload).encode("utf-8")
print(data)
# Browser-like User-Agent header for this request
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Windows i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
}

# Build our Request object (supplying 'data' makes it a POST)
req = urllib.request.Request(url=au, data=data, headers=headers)

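# Send the login POST; the cookie jar keeps any session cookies from the response
resp = urllib.request.urlopen(req)
resp.read()

# Request the post-login page; the installed opener sends the stored cookies
resp = urllib.request.urlopen(lu)
contents = resp.read().decode("iso8859-1")

# Save the HTML using the same encoding it was decoded with
with open("/sdcard/arq.html", "w", encoding="iso8859-1") as f:
    f.write(contents)

print(contents)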
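
A possible reason a POST like the one above does not log in: pages served as .aspx by ASP.NET Web Forms usually embed hidden inputs such as __VIEWSTATE and __EVENTVALIDATION in the login form and expect them to be sent back. Whether this particular site does so is an assumption, not something confirmed here. The sketch below uses only the standard library to collect every hidden input from the login page's HTML and merge it with the credentials. The names HiddenInputParser, build_login_data, and sample_html are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser
import urllib.parse


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_data(login_html, credentials):
    """Merge the page's hidden fields with the credentials and URL-encode them."""
    parser = HiddenInputParser()
    parser.feed(login_html)
    payload = dict(parser.fields)
    payload.update(credentials)  # credentials win if a name appears twice
    return urllib.parse.urlencode(payload).encode("utf-8")


# Hypothetical login-page snippet, used only to exercise the helper
sample_html = (
    '<form>'
    '<input type="hidden" name="__VIEWSTATE" value="abc123" />'
    '<input type="text" name="txtUser" />'
    '</form>'
)
data = build_login_data(sample_html, {"txtUser": "login", "txtPass": "senha"})
```

With the cookie-aware opener from the question, the idea would be to GET the login URL first, decode the response, pass that HTML and the payload dictionary to build_login_data, and POST the result to the same URL before requesting the post-login page. The field names in the payload should be checked against the name attributes in the real form, since they may differ from the ones guessed in the question.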
No answers
