How to transpose rows into columns in a data frame?

I have a time series of daily precipitation data from 1961 to 2017; a sample is shown below. I need to group these data by month across the years. Using the function dcast, I was able to turn the values of the "mes" (month) column into columns of the data frame and fill them with the values of the "prec" column, as in the question in [1].

ID dia mes  ano prec
 1  21 ago 1961   NA
 2  22 ago 1961 0.00
 3  23 ago 1961 1.00
 4  24 ago 1961 0.00
 5  25 ago 1961 0.00
 6  26 ago 1961 0.00
 7  27 ago 1961 0.00
 8  28 ago 1961 0.00
 9  29 ago 1961 0.00
10  30 ago 1961   NA
11  31 ago 1961 0.00
12   1 set 1961 0.00
13   2 set 1961 0.00
14   3 set 1961 0.00
15   4 set 1961 0.00
16   5 set 1961 0.00
17   6 set 1961 0.00
18   7 set 1961 0.00
19   8 set 1961   NA
20   9 set 1961 0.00
21  10 set 1961 0.00
22  11 set 1961 0.00
23  12 set 1961 0.00
24  13 set 1961 0.00
25  14 set 1961 0.00
26  15 set 1961 0.00
27  16 set 1961 0.00
28  17 set 1961 0.00
29  18 set 1961 0.00
30  19 set 1961 0.00
31  20 set 1961 0.00
32  21 set 1961 0.00
33  22 set 1961 0.00
34  23 set 1961 0.00
35  24 set 1961 0.00
36  25 set 1961 0.00
37  26 set 1961 0.00
38  27 set 1961 0.00
39  28 set 1961 0.00
40  29 set 1961 0.00
41  30 set 1961 0.00

However, the resulting data frame does not list each month's precipitation values consecutively from the first row. Instead, each month's values sit in their own block of rows, positioned according to the ID column, as shown schematically below.

ID  jan  fev  ...  ago  set  ...  dez
1                  NA
2                  0
3                  1
.                  .
.                  .
.                  .
11                 0
12                      0
.                       .
.                       .
.                       .
41                      0
42                            0

How can I solve this problem?

2 answers

1

I'm not sure if this is what you want, but you can try the tidyr package:

library(tidyr)
# df is the data frame from the question (the other answer calls it dados)
spread(df, mes, prec)

   ID dia  ano ago set
1   1  21 1961  NA  NA
2   2  22 1961   0  NA
3   3  23 1961   1  NA
4   4  24 1961   0  NA
5   5  25 1961   0  NA
6   6  26 1961   0  NA
7   7  27 1961   0  NA
8   8  28 1961   0  NA
9   9  29 1961   0  NA
10 10  30 1961  NA  NA
11 11  31 1961   0  NA
12 12   1 1961  NA   0
13 13   2 1961  NA   0
14 14   3 1961  NA   0
15 15   4 1961  NA   0
16 16   5 1961  NA   0
17 17   6 1961  NA   0
18 18   7 1961  NA   0
19 19   8 1961  NA  NA
20 20   9 1961  NA   0
21 21  10 1961  NA   0
22 22  11 1961  NA   0
23 23  12 1961  NA   0
24 24  13 1961  NA   0
25 25  14 1961  NA   0
26 26  15 1961  NA   0
27 27  16 1961  NA   0
28 28  17 1961  NA   0
29 29  18 1961  NA   0
30 30  19 1961  NA   0
31 31  20 1961  NA   0
32 32  21 1961  NA   0
33 33  22 1961  NA   0
34 34  23 1961  NA   0
35 35  24 1961  NA   0
36 36  25 1961  NA   0
37 37  26 1961  NA   0
38 38  27 1961  NA   0
39 39  28 1961  NA   0
40 40  29 1961  NA   0
41 41  30 1961  NA   0
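
The output above still has one row per ID, which is the block pattern the question describes: because ID is unique for every observation, it forces a separate output row for each input row. A minimal sketch of one way around this, assuming the rows should instead be identified by year and day of month (ano, dia), uses pivot_wider, which supersedes spread in current tidyr. The small data frame below is made up for illustration and only borrows the question's column names.

```r
library(tidyr)

# Toy stand-in for the question's data: same column names, made-up values.
df <- data.frame(
  ID   = 1:6,
  dia  = c(1L, 2L, 3L, 1L, 2L, 3L),
  mes  = c("ago", "ago", "ago", "set", "set", "set"),
  ano  = 1961L,
  prec = c(NA, 0, 1, 0, NA, 2)
)

# ID is left out of id_cols, so it no longer forces one output row
# per input row; rows are identified by (ano, dia) instead.
wide <- pivot_wider(df, id_cols = c(ano, dia),
                    names_from = mes, values_from = prec)
wide
```

Each month becomes a column whose values start at the first row, and the NA measurements are kept as NA. Days that do not exist in a given month (for example day 31 in set) would also show up as NA, so the two cases are not distinguishable in the wide table.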

1

I believe the simplest way is still the base R function xtabs.

result <- xtabs( ~ ID + mes, dados)
head(result)
#   mes
#ID  ago set
#  1   1   0
#  2   1   0
#  3   1   0
#  4   1   0
#  5   1   0
#  6   1   0

Another way, using only base R, might be:

reshape(dados[c(1, 3, 5)], idvar = "ID", timevar = "mes", direction = "wide")

And with the package reshape2 (naming the value column explicitly, so dcast does not have to guess it):

reshape2::dcast(dados, ID ~ mes, value.var = "prec")
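
All three calls above use ID as the row key, so each month's values still land in separate blocks of rows. As a hedged variation on the base R reshape call, the sketch below keys the rows on year and day of month instead; that choice of key is an assumption about what the question wants. The data frame dados_toy is a hypothetical, made-up stand-in with the same column names as dados.

```r
# Toy stand-in for dados: same column names, made-up values.
dados_toy <- data.frame(
  ID   = 1:6,
  dia  = c(1L, 2L, 3L, 1L, 2L, 3L),
  mes  = c("ago", "ago", "ago", "set", "set", "set"),
  ano  = 1961L,
  prec = c(NA, 0, 1, 0, NA, 2)
)

# Drop ID before reshaping and identify rows by (ano, dia);
# reshape() names the new columns prec.ago, prec.set, ...
wide <- reshape(dados_toy[c("dia", "mes", "ano", "prec")],
                idvar = c("ano", "dia"), timevar = "mes",
                direction = "wide")
wide
```

The result has one row per (ano, dia) pair and one prec.<month> column per month, with the original NA values preserved.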

Data.

dados <-
structure(list(ID = 1:41, dia = c(21L, 22L, 23L, 24L, 25L, 26L, 
27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L), mes = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("ago", "set"), class = "factor"), 
    ano = c(1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L), prec = c(NA, 
    0, 1, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0)), .Names = c("ID", "dia", "mes", "ano", "prec"), class = "data.frame", row.names = c(NA, 
-41L))
  • The xtabs function only partly worked: the rows containing NA were dropped, and I need them kept. Is there any way to do that? I tried following the example in the help page, but that didn't work either.

  • Actually, I noticed that the function xtabs gives me the frequency of the measurements. Is that right? What I need is a function that moves the rainfall values into the columns, with each column being a month of the year.
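
On the second comment: with a one-sided formula such as ~ ID + mes, xtabs builds a contingency table, so the cells are counts of observations rather than rainfall values. A minimal sketch of the difference, using tapply as a swapped-in base R alternative that carries the values and keeps NA, is given below. The data frame dados_toy is hypothetical and made up, and keying the rows on dia is an assumption for a single year; for several years, ano would need to be added to the grouping list, which gives a three-dimensional array.

```r
# Toy stand-in for dados: same column names, made-up values.
dados_toy <- data.frame(
  ID   = 1:6,
  dia  = c(1L, 2L, 3L, 1L, 2L, 3L),
  mes  = c("ago", "ago", "ago", "set", "set", "set"),
  ano  = 1961L,
  prec = c(NA, 0, 1, 0, NA, 2)
)

# One-sided formula: every cell is the number of observations.
counts <- xtabs(~ dia + mes, dados_toy)

# tapply() applies sum() to the prec values in each (dia, mes) cell;
# a cell whose only value is NA stays NA instead of being dropped.
vals <- with(dados_toy, tapply(prec, list(dia = dia, mes = mes), sum))

counts
vals
```

Cells with no observation at all also come out as NA in the tapply result, so a missing measurement and a non-existent day look the same there.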
