Good afternoon. I am developing a script that:
- accesses a system;
- finds certain information within that environment;
- generates a kind of report;
- creates a spreadsheet with the data.
My problem comes even before the parsing. I can access the environment that contains the information, but I can't get the Selenium webdriver to locate the elements it needs to click to reach the data that will go into the report.
I have the impression that JavaScript is causing the confusion: the frame that triggers the JavaScript is accessible, but the resulting page, which is visible to me, does not seem to be visible to the script.
How do I get around the JavaScript?
How do I make the webdriver "see" the final page the same way I see it?
(EDITED. Code below:)
    from selenium import webdriver
    import time
    from selenium.common.exceptions import NoSuchFrameException
    import os

    if os.path.exists('c:\\projudi') == False:
        os.makedirs('c:\\projudi')
    try:
        planilha = open('c:\\projudi\\relatorio.csv', 'r+')
    except FileNotFoundError:
        planilha = open('c:\\projudi\\relatorio.csv', 'w+')
    browser = webdriver.Chrome()
    browser.get('https://projudi.tjpr.jus.br/projudi')
    time.sleep(20)
    browser.switch_to_frame('mainFrame')
    browser.switch_to_frame('userMainFrame')
    links = browser.find_elements_by_class_name('link')
    n = len(links)
    for x in range(0, n, 2):
        if links[x].text != ('0'):
            links[x].click()
            time.sleep(2)
            try:
                browser.switch_to_frame('mainFrame')
                browser.switch_to_frame('userMainFrame')
                a = browser.find_elements_by_class_name('link')
            except NoSuchFrameException:
                a = browser.find_elements_by_class_name('link')
            if a != []:
                q = browser.find_elements_by_class_name('resultTable')
                w = q[0].text
                for x in range(len(w)):
                    dados = w.split('\n')
                for x in range(len(dados)):
                    planilha.writelines(dados[x])
                for x in range(int(len(a))):
                    a[x].click()
                    time.sleep(2)
                    browser.back()
                    time.sleep(2)
                    browser.switch_to_frame('mainFrame')
                    browser.switch_to_frame('userMainFrame')
                    a = browser.find_elements_by_class_name('link')
                browser.back()
                time.sleep(2)
            else:
                browser.back()
                time.sleep(2)
            browser.switch_to_frame('mainFrame')
            browser.switch_to_frame('userMainFrame')
            links = browser.find_elements_by_class_name('link')
    planilha.close()
    browser.close()
My question: when I reach the screen that contains the information I need (resultTable), I capture it whole and generate a variable with a string containing all the data. I split it and get a list of strings. So far so good: I write everything to the report file for further processing. Now, how do I control the FLOW? I already know that I will have to use a regex on the list item that contains the DATE, because I only need to access the entries from today back to 2 days ago. But how do I use this information as a REFERENCE for Python? Example: the script captures the table and puts it into a list like this:
list = ['0004434-48.2010', 'UNION', '(30 working days) 03/07/2017', '13/07/2017', '0008767-77.2013', '2017', '(10 working days) 03/07/2017', '13/07/2017']
The first item in the list is the first cell of the table (row 1, column 1); it contains the link. The control date is in the THIRD item (row 1, column 3), and item 5 already belongs to the next row (row 2, column 1). I don't know if I managed to explain that clearly! =/
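The layout described above (four strings per table row, with the link text first and the control date third) can be sketched as a chunking step. This is only a minimal sketch, assuming every row always yields exactly four strings; `COLUMNS_PER_ROW` and `group_rows` are hypothetical names, not part of the original script.

```python
COLUMNS_PER_ROW = 4  # assumption: each table row yields exactly four strings


def group_rows(cells, width=COLUMNS_PER_ROW):
    """Split a flat list of cell texts into rows of `width` items each."""
    if len(cells) % width != 0:
        raise ValueError("cell count is not a multiple of the row width")
    return [cells[i:i + width] for i in range(0, len(cells), width)]


# Sample data copied from the example list in the question.
dados = [
    '0004434-48.2010', 'UNION', '(30 working days) 03/07/2017', '13/07/2017',
    '0008767-77.2013', '2017', '(10 working days) 03/07/2017', '13/07/2017',
]

rows = group_rows(dados)
for row in rows:
    # row[0] is the link text, row[2] is the cell holding the control date
    print(row[0], '->', row[2])
```

With the data grouped this way, each row can be inspected on its own instead of counting positions in the flat list.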
I need to check the date: if it is today or yesterday, click the first item on that row; if it is not, move on to the next row.
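One possible way to express that check is to pull the dd/mm/yyyy date out of the third cell with a regex and compare it with the current day. This is a hedged sketch rather than a definitive solution: `extract_date`, `is_recent` and `max_age_days` are hypothetical names, the rows are assumed to be already grouped four cells at a time, and the date in the second sample row was changed to 01/07/2017 purely to show a row being skipped.

```python
import re
from datetime import date, datetime, timedelta

# Matches dates written as dd/mm/yyyy, e.g. 03/07/2017
DATE_RE = re.compile(r'\b(\d{2}/\d{2}/\d{4})\b')


def extract_date(text):
    """Return the first dd/mm/yyyy date found in `text`, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), '%d/%m/%Y').date()


def is_recent(cell_text, today=None, max_age_days=1):
    """True if the date in `cell_text` is today or at most `max_age_days` old."""
    today = today or date.today()
    found = extract_date(cell_text)
    if found is None:
        return False
    return timedelta(0) <= today - found <= timedelta(days=max_age_days)


rows = [
    ['0004434-48.2010', 'UNION', '(30 working days) 03/07/2017', '13/07/2017'],
    # date altered for illustration so that this row is skipped
    ['0008767-77.2013', '2017', '(10 working days) 01/07/2017', '13/07/2017'],
]

reference_day = date(2017, 7, 4)  # fixed "today" so the example is reproducible
to_click = [i for i, row in enumerate(rows) if is_recent(row[2], reference_day)]
print(to_click)
```

The resulting indices could then be used to decide which link element to click, assuming the list of link elements lines up one-to-one with the table rows, which would need to be confirmed on the real page. Setting `max_age_days=2` would cover the "up to 2 days ago" variant.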
I think it would be best to post the code you have so far, along with the error, so we can take a look.
– MagicHat
If possible, also include the link to the site in question. We can only work out how to capture the information if we can inspect the page.
– Denis Callau
I edited the question following your suggestions. It now contains the link and the code. Actually, I have already managed to get past this step by calling switch_to_frame twice (without really understanding why I needed both calls, but since it worked, I left it). Now I am at the point of actually capturing the information. Unfortunately the system requires a login and password...
– Bergo de Almeida
@BergodeAlmeida Can you export the HTML after login? From the page where the data is actually being extracted.
– rodrigorf
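On the point raised in the comments about needing `switch_to_frame` twice: when one frame is nested inside another, the driver has to enter the outer frame before the inner one becomes reachable, which would explain why both calls were needed. A minimal sketch, assuming `userMainFrame` sits inside `mainFrame` as the calls in the question suggest; `enter_frames` is a hypothetical helper, and it uses the `driver.switch_to.frame(...)` form, which replaces the deprecated `switch_to_frame` in newer Selenium releases.

```python
def enter_frames(driver, frame_names):
    """Go back to the top-level document, then enter each nested frame in order."""
    driver.switch_to.default_content()
    for name in frame_names:
        driver.switch_to.frame(name)


# Usage with a real Selenium driver (not executed here):
#   enter_frames(browser, ['mainFrame', 'userMainFrame'])
```

Starting from the top-level document each time keeps the helper safe to call again after navigation such as `browser.back()`, where the script in the question already re-enters both frames.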