POST function of the httr package returns NA


I’m trying to write an R script that sends a POST request to http://tabnet.datasus.gov.br/cgi/tabcgi.exe?sinannet/cnv/violebr.def, but without success. The goal is to extract the data table that the site generates after the form is submitted. Everything seems fine, but the POST (or even GET) function of the httr package returns NA instead of the HTML I expected. This is the code I’m using:

link <- 'http://tabnet.datasus.gov.br/cgi/tabcgi.exe?sinannet/cnv/violebr.def'

dados <- list("Linha" = "Munic%EDpio_de_notifica%E7%E3o",
           "Coluna" = "--N%E3o-Ativa--",
           "Incremento" = "Freq%FC%EAncia",
           "Arquivos" = "violeac15.dbf",
           "pesqmes1" = "Digite+o+texto+e+ache+f%E1cil",
           "SM%EAs_1%BA_Sintoma%28s%29" = "TODAS_AS_CATEGORIAS__",
           "pesqmes2" = "Digite+o+texto+e+ache+f%E1cil",
           "SMunic%EDpio_de_notifica%E7%E3o" = "TODAS_AS_CATEGORIAS__",
           "SRegi%E3o_de_Sa%FAde_%28CIR%29_de_notif" = "TODAS_AS_CATEGORIAS__",
           "SMacrorreg.de_Sa%FAde_de_notific" = "TODAS_AS_CATEGORIAS__",
           "SDiv.adm.estadual_de_notific" = "TODAS_AS_CATEGORIAS__",
           "SMicrorregi%E3o_IBGE_de_notific" = "TODAS_AS_CATEGORIAS__",
           "SReg.Metropolit%2FRIDE_de_notific" = "TODAS_AS_CATEGORIAS__",
           "pesqmes8" = "Digite+o+texto+e+ache+f%E1cil",
           "SMunic%EDpio_de_resid%EAncia"  = "TODAS_AS_CATEGORIAS__",
           "SRegi%E3o_de_Sa%FAde_%28CIR%29_de_resid" = "TODAS_AS_CATEGORIAS__",
           "SMacrorreg.de_Sa%FAde_de_resid%EAnc"  = "TODAS_AS_CATEGORIAS__",
           "SDiv.adm.estadual_de_resid%EAncia" = "TODAS_AS_CATEGORIAS__",
           "SMicrorregi%E3o_IBGE_de_resid%EAnc" = "TODAS_AS_CATEGORIAS__",
           "SReg.Metropolit%2FRIDE_de_resid" = "TODAS_AS_CATEGORIAS__",
           "pesqmes14" = "Digite+o+texto+e+ache+f%E1cil",
           "SUF_Ocorr%EAncia" = "TODAS_AS_CATEGORIAS__",
           "SCiclo_de_Vida" = "TODAS_AS_CATEGORIAS__",
           "pesqmes16" = "Digite+o+texto+e+ache+f%E1cil",
           "SFaixa_Et%E1ria" = 2,
           "SSexo" = 4,
           "SRa%E7a" = 2,
           "pesqmes19"  = "Digite+o+texto+e+ache+f%E1cil",
           "SEscolaridade" = 2,
           "pesqmes20" = "Digite+o+texto+e+ache+f%E1cil",
           "SLocal_ocorr%EAncia" = "TODAS_AS_CATEGORIAS__",
           "SViol_repeti%E7%E3o" = "TODAS_AS_CATEGORIAS__",
           "SLes%E3o_Autoprov" = "TODAS_AS_CATEGORIAS__",
           "SViol_F%EDsica" = "TODAS_AS_CATEGORIAS__",
           "SViol_Psico%2Fmoral" = "TODAS_AS_CATEGORIAS__",
           "SViol_Tortura" = "TODAS_AS_CATEGORIAS__",
           "SViol_Sexual"  = "TODAS_AS_CATEGORIAS__",
           "STraf._Seres_Huma"  = "TODAS_AS_CATEGORIAS__",
           "SViol_Finan%2FEcono" = "TODAS_AS_CATEGORIAS__",
           "SViol_Negli%2FAband"  = "TODAS_AS_CATEGORIAS__",
           "SViol_Trab._Infant"  = "TODAS_AS_CATEGORIAS__",
           "SViol_Interv_Legal" = "TODAS_AS_CATEGORIAS__",
           "SOutra_Violencia" = "TODAS_AS_CATEGORIAS__",
           "SFor%E7_corp._espanc" = "TODAS_AS_CATEGORIAS__",
           "SEnforcamento" = "TODAS_AS_CATEGORIAS__",
           "SObj._Contundente" = "TODAS_AS_CATEGORIAS__",
           "SObj._perf-cortant" = "TODAS_AS_CATEGORIAS__",
           "SSubs_Obj_Quente"  = "TODAS_AS_CATEGORIAS__",
           "SEnvenenamento" = "TODAS_AS_CATEGORIAS__",
           "SArma_de_fogo" = "TODAS_AS_CATEGORIAS__",
           "SAmea%E7a" = "TODAS_AS_CATEGORIAS__",
           "SOutra_Agress%E3o" = "TODAS_AS_CATEGORIAS__",
           "SAss%E9dio_Sexual" = "TODAS_AS_CATEGORIAS__",
           "SEstupro" = 1,
           "SAtent._viol_pudor" = "TODAS_AS_CATEGORIAS__",
           "SPornog_Infantil" = "TODAS_AS_CATEGORIAS__",
           "SExplora%E7%E3o_Sexual" = "TODAS_AS_CATEGORIAS__",
           "SOutras_Violencias" = "TODAS_AS_CATEGORIAS__",
           "SSusp._uso_alcool" = "TODAS_AS_CATEGORIAS__",
           "SPai" = "TODAS_AS_CATEGORIAS__",
           "SM%E3e" = "TODAS_AS_CATEGORIAS__",
           "SPadrasto" = "TODAS_AS_CATEGORIAS__",
           "SMadrasta" = "TODAS_AS_CATEGORIAS__",
           "SConjuge" = "TODAS_AS_CATEGORIAS__",
           "SEx-Conjuge" = "TODAS_AS_CATEGORIAS__",
           "SNamorado%28a%29" = "TODAS_AS_CATEGORIAS__",
           "SEx-Namorado%28a%29"  = "TODAS_AS_CATEGORIAS__",
           "SFilha%28a%29" = "TODAS_AS_CATEGORIAS__",
           "SIrm%E3o%28a%29" = "TODAS_AS_CATEGORIAS__",
           "SAmigos%2FConhec" = "TODAS_AS_CATEGORIAS__",
           "SDesconhecida%28a%29" = "TODAS_AS_CATEGORIAS__",
           "SCuidador%28a%29" = "TODAS_AS_CATEGORIAS__",
           "Spatrao%2FChefe" = "TODAS_AS_CATEGORIAS__",
           "SPes_com_Rel_Inst" = "TODAS_AS_CATEGORIAS__",
           "SPolicial_Ag.Lei" = "TODAS_AS_CATEGORIAS__",
           "SPropria_Pessoa" = "TODAS_AS_CATEGORIAS__",
           "SOutros_V%EDnc" = "TODAS_AS_CATEGORIAS__",
           "SEnc._Setor_Saude" = "TODAS_AS_CATEGORIAS__",
           "SEvolu%E7%E3o_do_caso" = "TODAS_AS_CATEGORIAS__",
           "zeradas" = "exibirlz",
           "formato" = "table",
           "mostre" = "Mostra")

httr::POST(url = link, body = dados)

Running these commands gives:

Response [http://tabnet.datasus.gov.br/cgi/tabcgi.exe?sinannet/cnv/violebr.def]
Date: 2017-04-02 03:24
Status: 200
Content-Type: text/html
Size: 1.74 kB
NA

Note that the status is 200 (indicating that everything went well), but there is no HTML! Any idea how to solve the problem?

  • The names on the left side of the data list do not need quotes, since you are assigning values. Also, have you tried getting this data via the Datasus package: https://github.com/danicat/datasus?

  • I know the package, but I ended up trying to solve the problem this way to practice web scraping. I will test the two suggestions, thank you very much!

  • Hi @José, at the moment the package only provides data from the Mortality Information System. I’m looking for information on rape cases in Brazil. Thank you very much anyway!

  • Do you know the name of the database in which the rape cases are recorded?

  • In SINAN (Notifiable Diseases Information System), @José.

1 answer


Rumenick, it is an encoding problem. httr tries to convert the output to UTF-8 and, when that fails, it returns NA.

The solution is to set the encoding manually, which in this case is 'latin1':

req <- httr::POST(url = link, body = dados)
html <- httr::content(req, 'text', encoding = 'latin1')
html

[1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML ...
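
To see why the encoding matters, here is a minimal base-R sketch that needs no network access. It assumes the server sends Latin-1 bytes, as the fix above suggests; the byte vector is a hand-built stand-in for the response body, and the use of `iconv()` here only illustrates the idea and is not a claim about httr’s internals.

```r
# Latin-1 bytes for "Município": 0xED is "í" in Latin-1,
# but on its own it is not a valid UTF-8 sequence.
bytes <- as.raw(c(0x4d, 0x75, 0x6e, 0x69, 0x63, 0xed, 0x70, 0x69, 0x6f))
txt <- rawToChar(bytes)

# The bytes are not valid UTF-8, so reading them as UTF-8 cannot work.
is_utf8 <- validUTF8(txt)

# Declaring the real source encoding converts the text cleanly.
decoded <- iconv(txt, from = "latin1", to = "UTF-8")
```

Here `is_utf8` is FALSE, while `decoded` is a valid UTF-8 string. Passing `encoding = 'latin1'` to `httr::content()` plays the same role as the `from = "latin1"` argument above.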

Don’t forget to use appropriate packages to read page elements instead of working with the raw string. Something like:

library(rvest)
# '#idDoQueVcPrecisa' is a placeholder: replace it with the id of the element you need
read_html(html) %>%
  html_node('#idDoQueVcPrecisa') %>%
  html_text()
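
Since the goal in the question is a data table, `html_table()` from rvest may be more convenient than `html_text()`. The sketch below runs on a small inline HTML string with made-up values; the `resultado` id and the column names are hypothetical, and the real TabNet page will most likely need a different selector.

```r
library(rvest)

# Hypothetical stand-in for the decoded response text (made-up values);
# the real page layout may differ.
html <- '<html><body><table id="resultado">
  <tr><th>Municipio</th><th>Casos</th></tr>
  <tr><td>Rio Branco</td><td>120</td></tr>
  <tr><td>Cruzeiro do Sul</td><td>35</td></tr>
</table></body></html>'

# Parse the string, select the table by its (assumed) id, and convert it
# to a data frame; the <th> row becomes the column names.
tabela <- read_html(html) %>%
  html_node("#resultado") %>%
  html_table()
```

The result has one row per `<tr>` of data, so it can go straight into the usual data-frame tools.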
