How to transpose rows into columns in a data frame?

Question

How to transpose rows into columns in a data frame?

Asked 7 years, 3 months ago

Viewed 3,944 times

2

I have a time series with daily precipitation data between 1961 and 2017, as an example below. I need to group these data by month over the years. I was able to transpose the "month" column of the data frame and fill in the values of the "Prec" column, using the function dcast, like the question in 1.

ID dia mes ano prec 1 21 ago 1961 NA 2 22 ago 1961 0.00 3 23 ago 1961 1.00 4 24 ago 1961 0.00 5 25 ago 1961 0.00 6 26 ago 1961 0.00 7 27 ago 1961 0.00 8 28 ago 1961 0.00 9 29 ago 1961 0.00 10 30 ago 1961 NA 11 31 ago 1961 0.00 12 1 set 1961 0.00 13 2 set 1961 0.00 14 3 set 1961 0.00 15 4 set 1961 0.00 16 5 set 1961 0.00 17 6 set 1961 0.00 18 7 set 1961 0.00 19 8 set 1961 NA 20 9 set 1961 0.00 21 10 set 1961 0.00 22 11 set 1961 0.00 23 12 set 1961 0.00 24 13 set 1961 0.00 25 14 set 1961 0.00 26 15 set 1961 0.00 27 16 set 1961 0.00 28 17 set 1961 0.00 29 18 set 1961 0.00 30 19 set 1961 0.00 31 20 set 1961 0.00 32 21 set 1961 0.00 33 22 set 1961 0.00 34 23 set 1961 0.00 35 24 set 1961 0.00 36 25 set 1961 0.00 37 26 set 1961 0.00 38 27 set 1961 0.00 39 28 set 1961 0.00 40 29 set 1961 0.00 41 30 set 1961 0.00

However, the resulting data frame does not present the consecutive precipitation values for each month, but rather blocks of values as a function of the table ID, as shown schematically below.

ID  jan  fev  ...  ago  set  ...  dez
1                  Na           
2                  0
3                  1
.                  .
.                  .
.                  .
11                 0
12                      0
.                       .
.                       .
.                       .
41                      0
42                            0

How can I solve this problem?

2 answers

Browser other questions tagged r

You are not signed in. Login or sign up in order to post.

by Marcos Banik • **1,779** points · Answer 1 · 2018-04-01T22:04:34+00:00

I’m not sure if it’s what you want, but you can try using the library tidyr

library(tidyr)
spread(df, mes, prec)

   ID dia  ano ago set
1   1  21 1961  NA  NA
2   2  22 1961   0  NA
3   3  23 1961   1  NA
4   4  24 1961   0  NA
5   5  25 1961   0  NA
6   6  26 1961   0  NA
7   7  27 1961   0  NA
8   8  28 1961   0  NA
9   9  29 1961   0  NA
10 10  30 1961  NA  NA
11 11  31 1961   0  NA
12 12   1 1961  NA   0
13 13   2 1961  NA   0
14 14   3 1961  NA   0
15 15   4 1961  NA   0
16 16   5 1961  NA   0
17 17   6 1961  NA   0
18 18   7 1961  NA   0
19 19   8 1961  NA  NA
20 20   9 1961  NA   0
21 21  10 1961  NA   0
22 22  11 1961  NA   0
23 23  12 1961  NA   0
24 24  13 1961  NA   0
25 25  14 1961  NA   0
26 26  15 1961  NA   0
27 27  16 1961  NA   0
28 28  17 1961  NA   0
29 29  18 1961  NA   0
30 30  19 1961  NA   0
31 31  20 1961  NA   0
32 32  21 1961  NA   0
33 33  22 1961  NA   0
34 34  23 1961  NA   0
35 35  24 1961  NA   0
36 36  25 1961  NA   0
37 37  26 1961  NA   0
38 38  27 1961  NA   0
39 39  28 1961  NA   0
40 40  29 1961  NA   0
41 41  30 1961  NA   0

by Rui Barradas • **15,422** points · Answer 2 · 2018-04-02T08:58:29+00:00

I believe that the simplest form is still with the base R function xtabs.

result <- xtabs( ~ ID + mes, dados)
head(result)
#   mes
#ID  ago set
#  1   1   0
#  2   1   0
#  3   1   0
#  4   1   0
#  5   1   0
#  6   1   0

Another way, using only R base may be

reshape(dados[c(1, 3, 5)], idvar = "ID", timevar = "mes", direction = "wide")

And with the package reshape2,

reshape2::dcast(dados, ID ~ mes)

DICE.

dados <-
structure(list(ID = 1:41, dia = c(21L, 22L, 23L, 24L, 25L, 26L, 
27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L), mes = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("ago", "set"), class = "factor"), 
    ano = c(1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 
    1961L, 1961L, 1961L, 1961L, 1961L, 1961L, 1961L), prec = c(NA, 
    0, 1, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0)), .Names = c("ID", "dia", "mes", "ano", "prec"), class = "data.frame", row.names = c(NA, 
-41L))