How to convert my data set from wide to long format with multiple variables in R

I have the following data frame:

dput(exemplo)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7),
    grupo = c("A", "A", "A", "B", "B", "B", NA),
    fc_pre = c(90, 98, 77, 98, 100, 92, 89),
    fc_pos = c(70, 77, 77, 70, 79, 72, 76),
    pa_pre = c(130, 140, 160, 160, 120, 120, 150),
    pa_pos = c(120, 110, 140, 150, 130, 120, 130)),
   .Names = c("id", "grupo", "fc_pre", "fc_pos", "pa_pre", "pa_pos"), 
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -7L))

I put it in long format using the code:

    library(reshape2)
    longo <- melt(exemplo, id=c("id", "grupo"))

The data frame is now in long format, but R put fc_pre, fc_pos, pa_pre and pa_pos in the same column:

    longo
    id grupo variable value
 1   1     A   fc_pre    90
 2   2     A   fc_pre    98
 3   3     A   fc_pre    77
 4   4     B   fc_pre    98
 5   5     B   fc_pre   100
 15  1     A   pa_pre   130
 16  2     A   pa_pre   140
 17  3     A   pa_pre   160
 18  4     B   pa_pre   160
 19  5     B   pa_pre   120

I’d like it to be this way:

 # A tibble: 14 x 5
       id grupo tempo     fc     pa
    <dbl> <chr> <chr>  <dbl>  <dbl>
  1     1     A   pre     90    130
  2     2     A   pre     98    140
  3     3     A   pre     77    160
  4     4     B   pre     98    160
  5     5     B   pre    100    120
  6     6     B   pre     92    120
  8     1     A   pos     70    120
  9     2     A   pos     77    110
 10     3     A   pos     77    140
 11     4     B   pos     70    150
 12     5     B   pos     79    130

Note that now I have one column called tempo, another called fc and another called pa. Does anyone know how to restructure my data frame to look like this?

1 answer

Hello. To do this, you need to combine the functions melt and dcast from the reshape2 package with the base function sub.

library(reshape2)
# df is the original wide data frame (exemplo in the question)
longo <- melt(df, id.vars=c("id","grupo")) # Move the variables from wide to long format
longo$tempo <- factor(with(longo, sub(".*_","",variable))) # Create the time variable (pre/pos)
longo$variavel <- factor(with(longo, sub("_.*","",variable))) # Create the measure variable (fc/pa)
longo <- longo[,-3] # Drop the original "variable" column
longo <- dcast(longo, id + grupo + tempo ~ variavel, value.var="value") # Spread fc and pa back into separate columns

After that, the dataframe will look like this:

  id grupo tempo fc  pa
1  1     A   pos 70 120
2  1     A   pre 90 130
3  2     A   pos 77 110
4  2     A   pre 98 140
5  3     A   pos 77 140
6  3     A   pre 77 160
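
The result above is sorted by id, with pos before pre, because the levels of the factor tempo are in alphabetical order. If the row order from the question (all pre rows first, then all pos rows) is wanted, a minimal sketch in base R could look like the following. The small data frame here is typed in by hand from the first rows of the output above, purely for illustration.

```r
# First rows of the result shown above, typed in by hand for illustration
longo <- data.frame(
  id    = c(1, 1, 2, 2, 3, 3),
  grupo = c("A", "A", "A", "A", "A", "A"),
  tempo = factor(c("pos", "pre", "pos", "pre", "pos", "pre")),
  fc    = c(70, 90, 77, 98, 77, 77),
  pa    = c(120, 130, 110, 140, 140, 160)
)

# Put "pre" before "pos" instead of the default alphabetical order
longo$tempo <- factor(longo$tempo, levels = c("pre", "pos"))

# Sort by time first, then by id, as in the question's desired output
longo <- longo[order(longo$tempo, longo$id), ]
rownames(longo) <- NULL
print(longo)
```

Setting the levels explicitly also controls the order in which pre and pos would appear in later plots or summaries.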

Note that for this code to work, the column names in the original data frame (df) must follow a consistent pattern, here measure_time (for example fc_pre), so that sub() can correctly separate the variable name from the time at which it was measured.
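
To see what the two sub() calls do on their own, here is a small sketch that applies them directly to the column names from the question. The vector name nomes is only illustrative.

```r
# Column names from the question's data frame
nomes <- c("fc_pre", "fc_pos", "pa_pre", "pa_pos")

# Remove everything up to and including the underscore: the time remains
tempo <- sub(".*_", "", nomes)

# Remove the underscore and everything after it: the measure remains
variavel <- sub("_.*", "", nomes)

print(tempo)
print(variavel)
```

Because `.*` is greedy, a name with more than one underscore, such as a hypothetical fc_max_pre, would give "pre" for the time and "fc" for the measure, so names like that would need a different pattern.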

  • Thank you, Guilherme Parreira, it worked here. I managed to reproduce the example; now I just need to understand some of the functions.
