I can’t install IBGEPesq in R to read the PNADs

4

I just downloaded the 2013 PNAD from IBGE and tried to open it with IBGEPesq, the R package that IBGE itself developed. It is available as a .zip archive at this address:

ftp://ftp.ibge.gov.br/Trabalho_e_Rendimento/Pesquisa_Nacional_por_Amostra_de_Domicilios_anual/microdados/2013/Leitura_em_R.zip

I downloaded it to my working directory and ran:

    install.packages("IBGEPesq_1.0-4.zip",
                     repos=NULL)

    library(IBGEPesq)

But then I get the following message:

    Error: package ‘IBGEPesq’ was built before R 3.0.0: please re-install it

Naturally, I tried to reinstall it. I also ran the following:

    # Update the other packages
    update.packages(checkBuilt = TRUE, ask = FALSE)

    # Recommended on a forum (I did not quite understand why)
    install.packages('codetools')

Even then, reinstalling does not work: the same error returns. I am on Windows 7, and the R session information is:

    > sessionInfo()
    R version 3.1.1 (2014-07-10)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
    [5] LC_TIME=Portuguese_Brazil.1252    

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    loaded via a namespace (and not attached):
    [1] tools_3.1.1

Thanks in advance for the help.

  • Try repeating the procedure using 32-bit R.

  • It doesn’t work either, Athos...

  • I failed to notice that the error you received says R has to be a version earlier than 3.0.0. I think it already runs in 32-bit R 2.14.x. See if that works.

  • @Athos, it would probably run in an earlier version of R. But it is not worth downgrading just to run this package. The question is precisely how to make it work in a current version...

  • @Rogeriojb I think IBGEPesq is gone; from what I have followed, the package has been discontinued. If they had made the source code available, we could easily adapt it, but as far as I know it is not available anywhere. That said, today there are reading functions that are even faster than those of IBGEPesq.

  • @Flaviobarros even if the tar.gz is not available, you can still see the source code of the functions. For example, you can install the package and run ls(getNamespace("IBGEPesq"), all.names=TRUE) to list all available functions. From there, just call edit(IBGEPesq:::nome_funcao) to open the function in a text editor. Cheers

  • Well, @Carlos Cinelli, try doing that with IBGEPesq and you will have a surprise... Back then I was initially going to use the package sources to update it to R 3.0.0. Unfortunately, there is compiled code in this package.
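The inspection approach suggested in the comments above can be tried on any installed package. The sketch below uses the stats package that ships with R, since it does not assume IBGEPesq can be installed; the helper uses_compiled() is a hypothetical name introduced here only for illustration. It lists the functions in a namespace and checks whether a function body calls into compiled code, which is the situation described for IBGEPesq.

```r
# List every object in a package namespace (stats is used as a stand-in).
ns <- getNamespace("stats")
funs <- ls(ns, all.names = TRUE)

# Print the R source of one function, as edit() would show it.
print(get("sd", envir = ns))

# Hypothetical helper: TRUE if the function body calls a compiled-code
# interface (.C, .Call, .External or .Fortran).
uses_compiled <- function(fn) {
  any(c(".C", ".Call", ".External", ".Fortran") %in% all.names(body(fn)))
}

uses_compiled(get("sd", envir = ns))   # sd() is written in plain R
uses_compiled(get("cor", envir = ns))  # cor() hands off to compiled code
```

When a function does hand off to compiled code, reading its R body is not enough to port it: the compiled part is not visible from the R session.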


2 answers

3

I recommend using the scripts created by Damico and Djalma, which can be found here. I’ve worked with them a lot and they work perfectly.

You get the data with download all microdata.R, and then run the analysis with single-year - analysis examples.R.

There you will also find scripts for PME, POF and PISA.

  • Hi @Jvlegend. Yes, I know Damico’s scripts. They’re really great. But I’m trying to use IBGE’s own package and wanted to know how to solve this particular problem. It is very likely, though, that they are no longer interested in investing in this package...

3

I know Damico’s scripts, but personally I prefer the solution I present here. Because the PNAD data is provided as fixed-width microdata, all you need is the survey dictionary, and reading becomes trivial with any fixed-width reader. For performance reasons I will use data.table here, plus fwf2csv(), a C++ function from the descr package that converts fixed-width text files into csv files. Then just use data.table’s fread() function, which reads csv files very fast.

First you will need the dictionary and the microdata, both of which can be downloaded here: http://www.ibge.gov.br/home/estatistica/populacao/trabalhoerendimento/pnad2012/microdados.shtm. From the spreadsheet "Dictionary of Household Variables of the Basic Research - 2013.xls", export the first three columns (Initial Position, Size, Variable Code) to a CSV file. In my case I saved it as dicdom.csv.

With the microdata in the Dados folder, run the script:

    ############# DATA PREPARATION ##########
    library(bit64)       # integer64 support for fread()
    library(data.table)  # fread()
    library(descr)       # fwf2csv()
    library(xlsx)        # loaded in the original script; not used below

    ## Build the dictionary from the first three columns of the spreadsheet
    dicdom <- read.csv(file = 'dicdom.csv', header = FALSE, stringsAsFactors = FALSE)
    dicdom <- dicdom[complete.cases(dicdom), ]
    colnames(dicdom) <- c('inicio', 'tamanho', 'variavel')

    ## End position of each field
    end_dom <- dicdom$inicio + dicdom$tamanho - 1

    ## Convert the fixed-width microdata into a delimited text file
    fwf2csv(fwffile = 'Dados/DOM2013.txt', csvfile = 'dadosdom.csv',
            names = dicdom$variavel, begin = dicdom$inicio, end = end_dom)

    ## Read the converted file with data.table's fread()
    dadosdom <- fread(input = 'dadosdom.csv', sep = 'auto', sep2 = 'auto', integer64 = 'double')

And that’s it! There are 148,697 households; just check with nrow(). Repeat the procedure for the person-level data.
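To see how the dictionary drives the reading without downloading anything, here is a minimal sketch in base R. The dictionary values and the two records are made up for illustration (they are not taken from the real PNAD layout), and substring() stands in for fwf2csv() only because it needs no extra packages; it would be far too slow for the real files.

```r
# Toy dictionary in the same three-column layout as dicdom.csv
# (start position, size, variable code); all values are invented.
dic <- data.frame(inicio   = c(1, 5, 7),
                  tamanho  = c(4, 2, 3),
                  variavel = c("V0101", "UF", "V0102"),
                  stringsAsFactors = FALSE)

# End position of each field, computed as in the script above
dic$fim <- dic$inicio + dic$tamanho - 1

# Two invented fixed-width records written to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("201311001", "201335002"), tmp)

# Cut every field out of every line by its start and end position
linhas <- readLines(tmp)
campos <- lapply(seq_len(nrow(dic)), function(i) {
  substring(linhas, dic$inicio[i], dic$fim[i])
})
names(campos) <- dic$variavel
toy <- as.data.frame(campos, stringsAsFactors = FALSE)

print(toy)
```

Each column comes back as character, so leading zeros are preserved; convert to numeric only the variables that are actually quantities.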

  • An important point to note is that there is the dicionariosIBGE package (http://arademaker.github.io/blog/2012/09/20/pacote-dictionaryIBGE.html), which already comes with the dictionaries of several PNADs up to 2012. I had included the latest dictionary, for 2012, and as soon as I can I’ll open a pull request to include the PNAD 2013 dictionary as well.

  • Sorry, this may speed up the import a little, but it does not merge the files, post-stratify the weights, or remove the missing values. A complete example should replicate an IBGE coefficient of variation.

  • @Anthonydamico, note that the purpose of my answer was to show a procedure for reading the PNAD data without using IBGEPesq. Those next steps are available in your scripts anyway, and perhaps the ideal would be to combine this way of reading with the post-processing you did. As soon as I have time I will try to do that. For this particular answer, though, I did not find it necessary. Thanks for the suggestion anyway.
