R - download CVM data via POST (package httr) method

7

I am trying to build an R function to download multiple documents directly from the system provided by CVM.

The general instructions given by the CVM for the multiple download are described in: http://sistemas.cvm.gov.br/Port/DownloadArqs/download02.htm

In summary, access to the system requires login and password authentication, and the query parameters must be sent via the POST method. The system responds in XML, providing the URLs for downloading the documents.

What I want is to get, in R, the list containing these URLs. For this, I tried to write a simple function using the "httr" package, shown below:

sist_cvm <- "https://www.rad.cvm.gov.br/DOWNLOAD/SolicitaDownload.asp"

login <- list(txtLogin = "MEU_LOGIN", txtSenha = "MINHA_SENHA", txtData = "15/04/2015", txtHora = "00:00", txtDocumento = "4")

library(httr)

acesso <- POST(url=sist_cvm, body=login, encode="multipart", verbose())

However, when I run it, it returns the following error:

SSL Certificate problem: Error in Function (type, msg, asError = TRUE)

Note: I tried several variations of the POST call, setting encode to form, multipart, and json, with and without verbose(). I also tried replacing the login and password elements of the login list with authenticate("MEU_LOGIN", "MINHA_SENHA"). Every attempt returned the same error.

Could someone give me some suggestions, please?

Thank you!

  • I can’t test without a user/password, but I found two answers (1, 2) on the English Stack Overflow that might help you. Basically, you need to point to a valid certificate for the connection to the server to succeed.

  • Dear Molx, I haven’t managed it yet... the most I got, after adding "verifypeer = FALSE", was the same result I get when I access the CVM multiple download system without entering a login and password (the message "incorrect login").

  • But I have WAMP installed on my PC, and I can do multiple downloads only in the conventional way (filling in the form manually) with this HTML code: <body> <form method="post" action="https://WWW.RAD.CVM.GOV.BR/DOWNLOAD/SolicitaDownload.asp"> <p>Login: <input type="text" name="txtLogin" value="MEU_LOGIN"> <p>Password: <input type="text" name="txtSenha" value="MINHA_SENHA"> <p>Date: <input type="text" name="txtData" value="13/03/2015"> <p>Time: <input type="text" name="txtHora" value="00:00"> <p>Document: <select name="txtDocumento"> <option value="DFP" selected="selected">DFP </select></p>

  • Try changing encode to "form" instead of "multipart". Another thing to check is whether you are passing the certificate, as in the link from Molx’s comment. cafile <- system.file("CurlSSL", "cacert.pem", package = "RCurl") acesso <- POST(url=sist_cvm, body=login, encode="form", verbose(), config(cainfo = cafile))

  • Molx and Icarus, thank you for trying to help, but it did not work. I will look for another way to automate downloading the files. Unfortunately, access via R fails right at authentication.

2 answers

2

It seems that this problem is solved in the latest version of the httr package. The code below worked:

library(httr)

cvm <- "https://WWW.RAD.CVM.GOV.BR/DOWNLOAD/SolicitaDownload.asp"

# Form fields expected by the CVM download service
informs <- list(txtLogin = "seulogin",
                txtSenha = "suasenha",
                txtData = format(Sys.Date(), "%d/%m/%Y"),
                txtHora = "00:00",
                txtDocumento = "TODOS")

# Send the fields as a form-encoded POST body and log the exchange
acesso <- POST(url = cvm,
               body = informs,
               encode = "form",
               verbose())
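To make concrete what encode = "form" puts on the wire, here is a minimal Python sketch using only the standard library. It builds the request without sending it, so no credentials are needed. The helper name build_form_request is hypothetical, and the field values are the placeholders from the question, with "TODOS" taken from the code above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(url, fields):
    """Build (but do not send) a POST request with a form-urlencoded body."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values copied from the thread; replace with real credentials.
fields = {
    "txtLogin": "MEU_LOGIN",
    "txtSenha": "MINHA_SENHA",
    "txtData": "15/04/2015",
    "txtHora": "00:00",
    "txtDocumento": "TODOS",
}

request = build_form_request(
    "https://www.rad.cvm.gov.br/DOWNLOAD/SolicitaDownload.asp", fields
)

# Show the encoded body: key=value pairs joined by "&", with "/" and ":" escaped.
print(request.data.decode("ascii"))
```

A multipart body would instead wrap each field in its own boundary-delimited part, which a classic ASP form handler may not read the same way. That is a plausible reason, though not one confirmed in this thread, why "form" is the safer choice here.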

Att.

  • Dear John Henry, thank you for the tip. However, it still did not work; a new error message appeared: "Error in curl::curl_fetch_memory(url, handle = handle) : Peer!"

  • I tried to investigate, but I don’t know how to solve this problem for now... I’m sorry!

  • Try running this command before the others: httr::set_config(config(ssl_verifypeer = 0L))

  • Dear John, thanks again for the tip, but it still doesn’t work. The message is now: "Error in curl::curl_fetch_memory(url, handle = handle) : Couldn’t connect to server"
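Setting ssl_verifypeer to 0 turns certificate checking off altogether. As an alternative, the sketch below (Python, standard library only) keeps verification on and optionally points it at a specific CA bundle, which is the same idea as the cainfo suggestion in the comments on the question. The function names and the ca_bundle parameter are hypothetical, and nothing here has been tested against the CVM server.

```python
import ssl
import urllib.request


def build_tls_context(ca_bundle=None):
    """Return a TLS context that verifies the server certificate and hostname.

    ca_bundle: optional path to a PEM file of trusted CAs (hypothetical
    parameter); None falls back to the system's default trust store.
    """
    return ssl.create_default_context(cafile=ca_bundle)


def send(request, context, timeout=30):
    """Send a prepared urllib request over the verified context and return the raw body."""
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        return response.read()


context = build_tls_context()
```

If verification fails with the default trust store, passing the path of a PEM bundle that contains the server's issuing CA is usually a better next step than disabling the check.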

1

For the record, here is an example form, based on @tpiccarelli's reply, that returns the data correctly:

<!-- http://www.cvm.gov.br/menu/regulados/companhias/download_multiplo/manual_tecnico.html -->
<body>
    <form method="post" action="http://seguro.bmfbovespa.com.br/rad/download/SolicitaDownload.asp">
        <p>Login: <input type="text" name="txtLogin" value="xxxxx"></p>
        <p>Senha: <input type="text" name="txtSenha" value="yyyyy"></p>
        <p>Data: <input type="text" name="txtData" value="26/04/2019"></p>
        <p>Hora: <input type="text" name="txtHora" value="00:00"></p>
        <p>Exibe Assunto IPE:
            <select name="txtAssuntoIPE">
                <option value="SIM" selected="selected">Sim</option>
                <option value="NÃO">Não</option>
            </select>
        </p>
        <p>Documento:
            <select name="txtDocumento">
                <option value="TODOS" selected="selected">TODOS</option>
                <option value="DFP">DFP</option>
                <option value="ENET">ENET</option>
                <option value="FCA">FCA</option>
                <option value="FRE">FRE</option>
                <option value="IAN">IAN</option>
                <option value="IPE">IPE</option>
                <option value="ITR">ITR</option>
                <option value="RAD">RAD</option>
                <option value="SEC">SEC</option>
            </select>
            <table style="margin-top: 10px;font-size: 12px">
                <thead>
                    <tr>
                        <td><b>Sigla</b></td>
                        <td><b>Tipo de documento</b></td>
                    </tr>
                </thead>
                <tbody>
                    <tr>
                        <td>DFP</td>
                        <td>Demonstrações Financeiras Padronizadas</td>
                    </tr>
                    <tr>
                        <td>ENET</td>
                        <td>Programa Empresas.NET</td>
                    </tr>
                    <tr>
                        <td>FCA</td>
                        <td>Formulário Cadastral</td>
                    </tr>
                    <tr>
                        <td>FRE</td>
                        <td>Formulário de Referência</td>
                    </tr>
                    <tr>
                        <td>IAN</td>
                        <td>Informações Anuais</td>
                    </tr>
                    <tr>
                        <td>IPE</td>
                        <td>Informações Periódicas</td>
                    </tr>
                    <tr>
                        <td>ITR</td>
                        <td>Informações Trimestrais</td>
                    </tr>
                    <tr>
                        <td>RAD</td>
                        <td>Formulários ITR, DFP e IAN</td>
                    </tr>
                    <tr>
                        <td>SEC</td>
                        <td>Formulário de Securitização</td>
                    </tr>
                </tbody>
            </table>
        </p>
        <button type="submit">Enviar</button>
    </form>
</body>

By the way, for those looking for other simple alternatives, here is a Python snippet that returns the same result.

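import requests

# Form fields expected by the download service (same names as in the HTML form above)
data = {
    'txtLogin': 'xxxxx',
    'txtSenha': 'yyyyy',
    'txtData': '26/04/2019',
    'txtHora': '00:00',
    'txtAssuntoIPE': 'SIM',
    'txtDocumento': 'TODOS',
}

url = 'http://seguro.bmfbovespa.com.br/rad/download/SolicitaDownload.asp'

# Send the fields as a form-encoded POST body
r = requests.post(url=url, data=data)

# Print the response line by line (each line is a bytes object)
for line in r.iter_lines():
    print(line)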
  • The Python code worked perfectly
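The original goal was a list of the download URLs contained in the XML response. The schema of that response is not shown in this thread, so the sketch below avoids assuming element names: it scans every attribute value and element text for http(s) links. The SAMPLE document and the extract_urls helper are hypothetical and exist only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Matches http(s) links up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(xml_text):
    """Return each distinct http(s) URL found in attributes or element text, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Check attribute values first, then the element's own text.
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text)
        for value in candidates:
            for match in URL_PATTERN.findall(value):
                if match not in urls:
                    urls.append(match)
    return urls


# Hypothetical response, NOT the real CVM schema.
SAMPLE = """<Downloads>
  <Link url="https://example.com/files/a.zip"/>
  <Link>https://example.com/files/b.zip</Link>
</Downloads>"""

for found in extract_urls(SAMPLE):
    print(found)
```

With the requests snippet above, the same function could be called as extract_urls(r.content), since ET.fromstring also accepts bytes. Once the real element names are known, a targeted root.iter("ElementName") lookup would be more precise than the regular expression.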
