I am using Selenium to open the site http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-ajustes-do-pregao-ptBR.asp, fill in the date box, and click the OK button. So far that part works.
import pandas as pd
import requests
from bs4 import BeautifulSoup
import urllib
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
import datetime
import shutil
from time import sleep
import os
options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
    "download": {"prompt_for_download": False}})
options.add_experimental_option('useAutomationExtension', False)
# Start the Chrome driver (passing the options built above) and open the site
g = webdriver.Chrome(options=options)
#g.get('https://www.google.com/')
g.get('http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-ajustes-do-pregao-ptBR.asp')
sleep(10)
# Date input; it could also be located with By.XPATH and '//*[@id="dData1"]'
t_dt = g.find_element(By.NAME, 'dData1')
t_dt.clear()
t_dt.send_keys('24/09/2019')
sleep(5)
# OK button
t_bt = g.find_element(By.XPATH, '//*[@id="divContainerIframeBmf"]/div[1]/div/form/div/div[2]/button')
t_bt.click()
# Update date shown on the page
update_date = g.find_element(By.XPATH, '//*[@id="divContainerIframeBmf"]/div[1]/div/form/div/div[3]/p').text
html = g.page_source.encode('utf-8')
soup = BeautifulSoup(html, 'lxml')
results = []
for row in soup.find_all('tr')[1:]:
    data = row.find_all('td')
    merc = data[0]
    venc = data[1]
    prec_ant = data[2]
    prec_atu = data[3]
    vari = data[4]
    results.append({'Mercadoria': merc.text,
                    'Vencimento': venc.text,
                    'Preço de ajuste anterior': prec_ant.text,
                    'Preço de ajuste atual': prec_atu.text,
                    'Variação': vari.text,
                    })
df = pd.DataFrame(results)
df.head()
The result is close to what I expect. The problem is in the Mercadoria (merchandise) column: the values get mixed up there, and as a result the first row of each merchandise loses its formatting.
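One possible reading of this symptom, offered only as a sketch: if the merchandise cell appears only in the first row of each group (for example because it spans several rows), the following rows have one cell fewer and every value shifts one column to the left. The helper below, `fill_merchandise`, is a hypothetical name, and the sample values are made up. It assumes that a full row has 5 cells and a continuation row has 4, which has not been checked against the real page.

```python
def fill_merchandise(rows, n_cols=5):
    """Carry the last seen merchandise down to rows that lack that cell.

    `rows` is a list of lists of cell texts, e.g. built with
    [[td.text.strip() for td in tr.find_all('td')]
     for tr in soup.find_all('tr')[1:]].
    Assumption: a full row has `n_cols` cells, and a continuation row has
    `n_cols - 1` cells because its merchandise cell is missing.
    """
    filled = []
    current = None
    for cells in rows:
        if len(cells) == n_cols:
            # Full row: remember its merchandise for the rows that follow
            current = cells[0]
            filled.append(list(cells))
        elif len(cells) == n_cols - 1 and current is not None:
            # Continuation row: prepend the remembered merchandise
            filled.append([current] + list(cells))
        # Rows with any other shape are skipped
    return filled


# Made-up sample rows, only to illustrate the shape of the data
sample = [
    ["DI1", "V19", "99.000,00", "99.010,00", "10,00"],
    ["X19", "98.500,00", "98.520,00", "20,00"],
    ["DOL", "V19", "4.150,00", "4.160,00", "10,00"],
]

for row in fill_merchandise(sample):
    print(row)
```

The returned list can be handed to `pd.DataFrame(filled, columns=[...])` with the five column names used above. If the page instead keeps the cell but leaves it empty, the length check would have to be replaced by a test for an empty first cell.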
Can you list the table data?
– Tmilitino
Yes, I can, with this snippet:
# Data extracted from the table
cols = g.find_elements_by_xpath('//*[@id="tblDadosAjustes"]/tbody/tr/td')
for col in cols:
    print(col.text.split('\n'))
– LePy
Add this snippet to the question and say that you can print the data and that your difficulty is getting it into a DataFrame, because that was not clear in the question. Thank you!
– Tmilitino
I edited the question to include the complete code. I believe my question is clearer now.
– LePy
If one of the answers below solved your problem and left no doubts, choose the one you liked the most and mark it as correct/accepted by clicking the check mark next to it, which also marks your question as solved. If you still have questions or would like further clarification, feel free to comment.
– Lucas