ASCII reading files for the 2010 Census

2

Does anyone know where to download the ASCII reading files (.sas) for the IBGE 2010 Demographic Census microdata?

I know that Anthony Damico keeps a few of these files on his website (see below how to download one), but I am looking for the files provided by IBGE itself. Damico does not provide, for example, the reading file for the mortality database.

   # download the SAS input file for the persons database

     download.file( "https://raw.github.com/ajdamico/asdfree/master/Censo%20Demografico/SASinputPes.txt" , "LEPESSOAS.sas" )

P.S. On the IBGE website for the 2010 census it is possible to download the microdata and the documentation, but there is no information about SAS reading files.

UPDATE (02 Oct 2015)

I confirmed @Rcoster's answer with two IBGE researchers: IBGE does not provide the SAS reading files on its website. I followed @Rcoster's suggestion and created a script that:

  • downloads the data and documentation of the 2010 census
  • uses the Excel variable dictionary to build the layout for reading the .txt files, and converts the result to data.table format
  • saves the databases as .csv

The script is very fast and is available here. Suggestions are welcome.
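
The read-and-save steps can be sketched on toy data. This is only an illustration, not the script mentioned above: the two-variable dictionary, the records, and the temporary file names are all made up, and it uses base R (`read.fwf()` and `write.csv()`) instead of data.table so that it runs without extra packages.

```r
# Hypothetical dictionary: variable names and field widths (made up for this sketch)
dic <- data.frame(VAR = c("V0001", "V0002"), width = c(2L, 5L),
                  stringsAsFactors = FALSE)

# Toy fixed-width file standing in for one of the census .txt files
txt_file <- tempfile(fileext = ".txt")
writeLines(c("3300123", "3500045"), txt_file)

# Read the fixed-width records using the widths from the dictionary;
# reading as character keeps leading zeros in codes
dados <- read.fwf(txt_file, widths = dic$width, col.names = dic$VAR,
                  colClasses = "character")

# Save the result as .csv
csv_file <- tempfile(fileext = ".csv")
write.csv(dados, csv_file, row.names = FALSE)
```

Reading by widths alone assumes the fields are contiguous, with no unused columns between them. If the layout has gaps, the start and end positions should be used instead.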

  • The census microdata are in CSV or .xls, so they can be read directly (read.xls() or read.csv2()). Or do you want the sample data?

  • I want to read the sample data. Thank you for the information

2 answers

2

These files are not made available by IBGE. What IBGE does make available is a file with the layout of each database (Layout/Layout_microdados_Amostra.xls), which allows you to write your own syntax.
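
As a rough idea of what "writing your own syntax" from a layout could look like, the sketch below builds the lines of a SAS `INPUT` statement in R. The layout excerpt is hypothetical (the variable names, positions, widths, and types are invented, not copied from the IBGE file), and the generated statement should be checked against the real layout before use.

```r
# Hypothetical excerpt of a layout: name, start position, total width,
# number of decimal places, and whether the field is character
layout <- data.frame(
  VAR     = c("V0001", "V0002", "V0010"),
  pos.ini = c(1L, 3L, 8L),
  width   = c(2L, 5L, 16L),
  DEC     = c(0L, 0L, 13L),
  char    = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# SAS informat: "$w." for character fields, "w.d" for numeric fields
dec_part <- ifelse(layout$DEC > 0, as.character(layout$DEC), "")
informat <- ifelse(layout$char,
                   paste0("$", layout$width, "."),
                   paste0(layout$width, ".", dec_part))

# One "@position NAME informat" entry per variable
sas_lines <- c("INPUT",
               sprintf("  @%d %s %s", layout$pos.ini, layout$VAR, informat),
               ";")
writeLines(sas_lines)
```

The `w.d` informat tells SAS to divide by 10^d when the raw field has no decimal point, which is how implied decimals are usually handled in fixed-width files.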

0


Complementing @Rcoster's answer, here is an alternative solution in which the syntax for reading the .txt data is built from the .xls variable dictionary file.

# Load libraries
  library(data.table)
  library(readxl)


# Open the Excel file with the variable dictionary
  dic_dom <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 1, skip = 1)
  dic_pes <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 2, skip = 1)
  dic_mor <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 4, skip = 1)

# Convert to data.table
  setDT(dic_dom)
  setDT(dic_pes)
  setDT(dic_mor)

# Function that computes the width of each variable and renames
# the start and end position columns
  computeWidth <- function(dataset) {
    dataset[is.na(DEC), DEC := 0]        # missing decimal places count as 0
    dataset[, width := INT + DEC]        # width = integer digits + decimal digits
    setnames(dataset, colnames(dataset)[3], "pos.ini")
    setnames(dataset, colnames(dataset)[4], "pos.fin")
  }

# Apply the function to the three dictionaries (invisible() suppresses the printed list)
  invisible(lapply(list(dic_dom, dic_pes, dic_mor), computeWidth))
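
Once a dictionary has `pos.ini`, `pos.fin` and `DEC`, the fixed-width records can be cut by position. The sketch below uses base R only and a hypothetical two-variable dictionary with made-up records. It also assumes that `DEC` gives the number of implied decimal places (no decimal point stored in the file), which should be checked against the documentation.

```r
# Hypothetical dictionary in the shape produced by computeWidth()
dic <- data.frame(
  VAR     = c("V0001", "V0010"),
  pos.ini = c(1L, 3L),
  pos.fin = c(2L, 7L),
  DEC     = c(0L, 3L),
  stringsAsFactors = FALSE
)

records <- c("3312345", "3500500")   # toy fixed-width records, made up

# Cut each field out of every record by its start and end position
fields <- lapply(seq_len(nrow(dic)), function(i) {
  substring(records, dic$pos.ini[i], dic$pos.fin[i])
})
names(fields) <- dic$VAR
census <- as.data.frame(fields, stringsAsFactors = FALSE)

# Assumption: fields with DEC > 0 carry implied decimals, so rescale them
for (i in which(dic$DEC > 0)) {
  v <- dic$VAR[i]
  census[[v]] <- as.numeric(census[[v]]) / 10^dic$DEC[i]
}
```

Cutting by position rather than by width keeps the result correct even if the layout leaves unused columns between fields.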

This code was taken from this script here, which downloads and reads the databases of the 2010 census.
