William, what you want to do is convert the data.frame from long format to wide format.
The variable col, built from the changes in ID, can be constructed as follows.
Reproducing your original data.frame:
df <- data.frame(ID = c(rep(313721, 3),
                        rep(313718, 2),
                        rep(313729, 6)),
                 COD_pr = c(205073,
                            176779,
                            191991,
                            198089,
                            201429,
                            167489,
                            119926,
                            170093,
                            170363,
                            123486,
                            158028))
Computing the column from the changes in ID (a counter that restarts at 1 each time the ID changes):
df$col <- unlist(sapply(rle(df$ID)$lengths, seq, from = 1))
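The same counter can probably be written more directly with base R's sequence(). The sketch below is only an illustration on the ID vector from the example; the names col_runs and col_by_id are hypothetical. It also shows an ave() variant, which counts within each ID value and so does not assume that the rows of an ID are contiguous.

```r
# Same IDs as in the example above: each ID appears in one contiguous run.
ID <- c(rep(313721, 3), rep(313718, 2), rep(313729, 6))

# rle() returns the run lengths (3, 2, 6);
# sequence() expands each length n into 1, 2, ..., n.
col_runs <- sequence(rle(ID)$lengths)

# Alternative: number the rows within each ID value,
# regardless of whether the rows of an ID are adjacent.
col_by_id <- ave(ID, ID, FUN = seq_along)

print(col_runs)
print(col_by_id)
```

For data where each ID forms a single contiguous block, as here, both approaches give the same numbering.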
Converting the data.frame to wide format, first with tidyr and then with reshape2:
library(tidyr)
spread(df, col, COD_pr)
ID 1 2 3 4 5 6
1 313718 198089 201429 NA NA NA NA
2 313721 205073 176779 191991 NA NA NA
3 313729 167489 119926 170093 170363 123486 158028
library(reshape2)
dcast(df, ID~col, value.var = "COD_pr")
ID 1 2 3 4 5 6
1 313718 198089 201429 NA NA NA NA
2 313721 205073 176779 191991 NA NA NA
3 313729 167489 119926 170093 170363 123486 158028
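In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshaping, assuming a recent tidyr is installed, might look like this; the name wide is hypothetical, and the row order of the result may differ from the spread() output above.

```r
library(tidyr)

# Rebuild the example data and the per-ID counter.
df <- data.frame(ID = c(rep(313721, 3), rep(313718, 2), rep(313729, 6)),
                 COD_pr = c(205073, 176779, 191991, 198089, 201429, 167489,
                            119926, 170093, 170363, 123486, 158028))
df$col <- sequence(rle(df$ID)$lengths)

# names_from supplies the new column names, values_from the cell values;
# IDs with fewer than six codes are filled with NA.
wide <- pivot_wider(df, names_from = col, values_from = COD_pr)
print(wide)
```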
Previous answer
In the example you gave, the variable "col" is missing; that is, we need to know how to decide which column each COD_pr value goes into, so that the values can be distributed the way you want.
Assuming this "col" variable exists, this is easy to do with either tidyr or reshape2. See this illustrative example, similar to your data:
set.seed(10)
df <- data.frame(ID = rep(c(1, 2, 3), 4),
                 col = c(rep("col1", 3),
                         rep("col2", 2),
                         rep("col3", 1),
                         rep("col4", 1),
                         rep("col5", 2),
                         rep("col6", 3)),
                 COD_pr = rnorm(12))
library(tidyr)
spread(df, col, COD_pr)
ID col1 col2 col3 col4 col5 col6
1 1 0.01874617 -0.5991677 NA -1.208076 NA -0.2564784
2 2 -0.18425254 0.2945451 NA NA -0.363676 1.1017795
3 3 -1.37133055 NA 0.3897943 NA -1.626673 0.7557815
library(reshape2)
dcast(df, ID~col, value.var = "COD_pr")
ID col1 col2 col3 col4 col5 col6
1 1 0.01874617 -0.5991677 NA -1.208076 NA -0.2564784
2 2 -0.18425254 0.2945451 NA NA -0.363676 1.1017795
3 3 -1.37133055 NA 0.3897943 NA -1.626673 0.7557815
William, welcome to Sopt. Why do you need the data in this format? By R convention, columns represent variables and rows represent observations. If the number of COD_pr values varies for each ID, it would probably be best to leave the data in its current format and work with it that way. Another possibility would be to combine all the values into a single column, but that is not a pleasant layout. Doing what you want is possible, but it will probably cause you problems later, because apparently your columns have no real meaning. – Molx
William, the variable indicating which column each COD_pr value should go into is missing; once that variable is provided, what you want is simple to do, as answered below. – Carlos Cinelli
William, I posted an answer that computes the column from the changes in ID and then converts the data.frame to wide format. – Carlos Cinelli
This operation is called a pivot; see this response – Ivan Ferrer