1
Consider the data.frame:
df_1 <- data.frame(
a = replicate(6, runif(30, 20, 100)),
b = rep(c(LETTERS[1:5]), times = 1, each = 6)
)
Use of gather:
library(tidyverse)
library(magrittr)
df_1 %<>% as_tibble
x <- df_1 %>%
select_at(vars(num_range('a.', 1:3))) %>%
gather(key = 'factors', value = 'case') %>%
print()
# A tibble: 90 x 2
factors case
<chr> <dbl>
1 a.1 91.0
2 a.1 56.2
3 a.1 34.0
4 a.1 85.1
5 a.1 66.2
6 a.1 21.7
7 a.1 29.8
8 a.1 80.3
9 a.1 59.8
10 a.1 85.4
# … with 80 more rows
Use of spread to return to the original data:
y <- x %>%
spread(key = factors, value = case)
Error: Duplicate Identifiers for Rows (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30), (31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60), (61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90) Call
rlang::last_error()to see a backtrace
To solve this (in part), I used mutate with row_number():
y <- x %>%
mutate(n = row_number()) %>%
spread(key = factors, value = case) %>%
print()
# A tibble: 90 x 4
n a.1 a.2 a.3
<int> <dbl> <dbl> <dbl>
1 1 91.0 NA NA
2 2 56.2 NA NA
3 3 34.0 NA NA
4 4 85.1 NA NA
5 5 66.2 NA NA
6 6 21.7 NA NA
7 7 29.8 NA NA
8 8 80.3 NA NA
9 9 59.8 NA NA
10 10 85.4 NA NA
# … with 80 more rows
The three columns are returned, but the cases do not match (that is, next to each value, there is a missing data - NA). How do I adjust this with some function of tidyverse so as to leave mine data.frame with 30 lines and not 90?
It has to include a
group_by(factor)before themutate(n = row_number()).– Tomás Barcellos