How to turn rows into columns in R when the number of rows per ID varies

3

I have a data frame like this:

[image: enter image description here]

And I need it to look like this:

[image: enter image description here]

The variable that defines the column can be calculated as shown in the image below:

[image: enter image description here]

The sequence restarts at each change of ID. In Excel this would be easy, but I want to learn how to do it in R.

  • William, welcome to SOpt. Why do you need the data in this format? By R convention, columns represent variables and rows represent observations. If the number of COD_pr values varies by ID, it would probably be best to keep the current format and work with it as is. Another possibility would be to combine all the values into a single column, but that is not a pleasant layout. What you want is possible, but it will probably cause you problems later, because the resulting columns apparently have no real meaning.

  • William, the variable indicating which column each COD_pr should go in is missing. Once that variable is provided, doing what you want is simple, as shown in the answer below.

  • William, I posted an answer that computes the column from the changes in ID and then converts the data.frame to wide format.

  • This operation is called a pivot; see this answer.

1 answer

3

William, what you want to do is convert the data.frame from long format to wide format.

The col variable can be built from the changes in ID as follows.

Reproducing your original data.frame:

df <- data.frame(ID = c(rep(313721, 3),
                        rep(313718, 2),
                        rep(313729, 6)),
                 COD_pr = c(205073,
                            176779,
                            191991,
                            198089,
                            201429,
                            167489,
                            119926,
                            170093,
                            170363,
                            123486,
                            158028))

Computing the column from the changes in ID:

df$col <- unlist(lapply(rle(df$ID)$lengths, seq_len))
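As a hedged aside, the same numbering can be sketched with two other base R functions. sequence() builds 1:n for each run length and always returns a plain vector, whatever the run lengths are. ave() numbers the rows within each ID value, so it also works when the rows of one ID are not adjacent. The names col_rle and col_ave are illustrative and not part of the original answer.

```r
# Same example data as above; here the rows of each ID are contiguous.
df <- data.frame(ID = c(rep(313721, 3), rep(313718, 2), rep(313729, 6)),
                 COD_pr = c(205073, 176779, 191991, 198089, 201429, 167489,
                            119926, 170093, 170363, 123486, 158028))

# sequence() concatenates 1:n for every run length n found by rle().
col_rle <- sequence(rle(df$ID)$lengths)

# ave() applies seq_along() within each group of equal IDs,
# returning the result in the original row order.
col_ave <- ave(seq_along(df$ID), df$ID, FUN = seq_along)
```

For this data both vectors are the same. They would differ only if an ID reappeared after a different ID: the rle() version restarts the count at every change, while the ave() version continues counting per ID.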

Converting the data.frame to wide format:

library(tidyr)
spread(df, col, COD_pr)
      ID      1      2      3      4      5      6
1 313718 198089 201429     NA     NA     NA     NA
2 313721 205073 176779 191991     NA     NA     NA
3 313729 167489 119926 170093 170363 123486 158028

library(reshape2)
dcast(df, ID~col, value.var = "COD_pr")
      ID      1      2      3      4      5      6
1 313718 198089 201429     NA     NA     NA     NA
2 313721 205073 176779 191991     NA     NA     NA
3 313729 167489 119926 170093 170363 123486 158028
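In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the equivalent call, assuming a tidyr version that provides pivot_wider(); the name wide is illustrative. The row order may differ from the spread() output above, since pivot_wider() does not sort by ID.

```r
library(tidyr)

# Rebuild the example data and the within-ID counter used above.
df <- data.frame(ID = c(rep(313721, 3), rep(313718, 2), rep(313729, 6)),
                 COD_pr = c(205073, 176779, 191991, 198089, 201429, 167489,
                            119926, 170093, 170363, 123486, 158028))
df$col <- sequence(rle(df$ID)$lengths)

# One row per ID, one column per value of col; missing cells become NA.
wide <- pivot_wider(df, names_from = col, values_from = COD_pr)
```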

Previous answer

However, the example you gave is missing the "col" variable. In other words, we need to know which column each COD_pr value belongs in before we can distribute the values the way you want.

Assuming this "col" variable exists, this is easy to do with either tidyr or reshape2. See this illustrative example, similar to your data:

set.seed(10)
df <- data.frame(ID = rep(c(1,2,3), 4),
                 col = c(rep("col1", 3), 
                         rep("col2", 2),
                         rep("col3", 1), 
                         rep("col4", 1),
                         rep("col5", 2),
                         rep("col6", 3)),
                 COD_pr = rnorm(12))


library(tidyr)
spread(df, col, COD_pr)
  ID        col1       col2      col3      col4      col5       col6
1  1  0.01874617 -0.5991677        NA -1.208076        NA -0.2564784
2  2 -0.18425254  0.2945451        NA        NA -0.363676  1.1017795
3  3 -1.37133055         NA 0.3897943        NA -1.626673  0.7557815

library(reshape2)
dcast(df, ID~col, value.var = "COD_pr")
  ID        col1       col2      col3      col4      col5       col6
1  1  0.01874617 -0.5991677        NA -1.208076        NA -0.2564784
2  2 -0.18425254  0.2945451        NA        NA -0.363676  1.1017795
3  3 -1.37133055         NA 0.3897943        NA -1.626673  0.7557815
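If neither package is available, base R's reshape() can do the same conversion. This is only a sketch on the illustrative data above; the name wide_base is an assumption, and the resulting columns are prefixed with the value column's name (COD_pr.col1, COD_pr.col2, and so on) rather than being named col1 to col6.

```r
# Same illustrative data as above.
set.seed(10)
df <- data.frame(ID = rep(c(1, 2, 3), 4),
                 col = c(rep("col1", 3), rep("col2", 2), rep("col3", 1),
                         rep("col4", 1), rep("col5", 2), rep("col6", 3)),
                 COD_pr = rnorm(12))

# idvar identifies the rows of the wide table, timevar the new columns.
wide_base <- reshape(df, idvar = "ID", timevar = "col", direction = "wide")
```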
