Macro to open, stack and save several . sequential csv files in SPSS

Asked

Viewed 57 times

0

How to program a macro that:

  1. Only import a few columns (for example: ano, mes, id, valor and motivo ) of sequential . csv files for SPSS. These files would have type names: DADOS_JAN_2010, DADOS_FEV_2010, [...], until DADOS_DEZ_2019.

  2. Change the lengths of string variables id for (a11), motivo for (a32), where appropriate (necessary to reconcile the archives of the different years).

  3. Stack all these data into a new dataset/table with type name: DADOS_2010_2019.

If useful, an example of the data:

| ANO  | MÊS | ID          | VALOR | MOTIVO |
|------|-----|-------------|-------|--------|
| 2010 | 1   | 12345678910 | 1100  | promo  |
| 2010 | 1   | 12345678911 | 200   | faltas |
| 2010 | 1   | 12345678912 | 1450  | meta   |
| 2010 | 1   | 12345678913 | 400   | faltas |
| 2010 | 1   | 12345678914 | 1800  | meta   |
| 2010 | 1   | 12345678915 | 600   | padrao |
| 2010 | 1   | 12345678916 | 700   | padrao |
| 2010 | 1   | 12345678917 | 800   | meta   |
| 2010 | 1   | 12345678918 | 900   | meta   |
| 2010 | 1   | 12345678919 | 1000  | promo  |
| 2010 | 1   | 12345678920 | 800   | padrao |

For now, I’m opening, stacking and saving the files two by two. But this is repetitive and inefficient. In addition, if more variables need to be included in the future, the whole code needs to be redone. Therefore, I believe that a macro would be the smartest way to accomplish this task.

An example of part of my code so far:


GET DATA  /TYPE=TXT
  /FILE="C:\Users\luizz\DADOS\DADOS_JAN_2010.csv"
  /ENCODING='Locale'
  /DELCASE=LINE
  /DELIMITERS=";"
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
  ANO F4.0
  MES F1.0
  ID A11
  VALOR F4.0
  MOTIVO A8.
CACHE.
EXECUTE.
DATASET NAME JAN_2010 WINDOW=FRONT.

ALTER TYPE MOTIVO (a32).

GET DATA  /TYPE=TXT
  /FILE="C:\Users\luizz\DADOS\DADOS_FEV_2010.csv"
  /ENCODING='Locale'
  /DELCASE=LINE
  /DELIMITERS=";"
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
  ANO F4.0
  MES F1.0
  ID A11
  VALOR F4.0
  MOTIVO A8.
CACHE.
EXECUTE.
DATASET NAME FEV_2010 WINDOW=FRONT.

DATASET ACTIVATE FEV_2010.
ALTER TYPE MOTIVO (a32).

DATASET ACTIVATE JAN_2010.
ADD FILES /FILE=*
  /FILE='FEV_2010'.
EXECUTE.

SAVE OUTFILE='C:\Users\luizz\DADOS\DADOS_JAN_FEV_2010.sav'
  /COMPRESSED.

Someone would know how to program a macro for this in SPSS?

Thank you,

1 answer

1


Well, I asked the same question in stackoverflow in English and a user there @Li-k, knew how to answer. The link to the answer is here. And the credits of the code are from the quoted user.

I have friends who have experienced similar difficulties in importing sequential . csv data into SPSS. Several Brazilian government databases are available segmented in this way. Datasus hospital admissions data, for example, have the exact format described here. The RAIS labor market data is made available by UF and per year, so the logic is more or less the same. These data are too large to work on in R, so it may be useful to open them in SPSS. For these reasons, I decided to translate and share the answer to this stackoverflow question in English here:

Whereas the parameters for all files are equal to your question, you can use the following macro:

define !getfiles ()
!do !yr=2010 !to 2019
!do !mn !in("JAN FEV MAR ABR MAI JUN JUL AGO SET OUT NOV DEZ")
GET DATA 
  /TYPE=TXT /FILE=!concat('"C:\Users\luizz\DADOS\DADOS_', !mn, '_', !yr, '.csv"')
  /ENCODING='Locale'   /DELCASE=LINE   /DELIMITERS=";"   /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2   /IMPORTCASE=ALL   /VARIABLES= 
  YEAR F4.0
  MONTH F1.0
  ID A11
  VALUE F4.0
  MOTIVE A8.
CACHE.
EXECUTE.
ALTER TYPE id (a11) MOTIVE (a32).
dataset name tmp.
dataset activate gen.
add files /file=* /file=tmp.
exe.
!doend !doend
!enddefine.

The macro was set to read each of the files and add them to a main file. Before we call the macro, we need to create the main file:

data list list/YEAR (F4)  MONTH (F1) ID (A11) VALUE (F4) MOTIVE (A8).
begin data
end data.
exe.
dataset name gen.
* Agora chamamos a macro
!getfiles .
* Agora que os dados estão empilhados, podemos salvá-los
SAVE OUTFILE='C:\Users\luizz\DADOS\DATA_2010_2019.sav'  /COMPRESSED.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.