Consider the data.frame:
df_1 <- data.frame(
  a = replicate(6, runif(30, 20, 100)),
  b = rep(c(LETTERS[1:5]), times = 1, each = 6)
)
Use of gather:
library(tidyverse)
library(magrittr)

df_1 %<>% as_tibble

x <- df_1 %>%
  select_at(vars(num_range('a.', 1:3))) %>%
  gather(key = 'factors', value = 'case') %>%
  print()
# A tibble: 90 x 2
   factors  case
   <chr>   <dbl>
 1 a.1      91.0
 2 a.1      56.2
 3 a.1      34.0
 4 a.1      85.1
 5 a.1      66.2
 6 a.1      21.7
 7 a.1      29.8
 8 a.1      80.3
 9 a.1      59.8
10 a.1      85.4
# … with 80 more rows
Use of spread to return to the original data:
y <- x %>%
  spread(key = factors, value = case)
Error: Duplicate Identifiers for Rows (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30), (31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60), (61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90)
Call rlang::last_error() to see a backtrace
To solve this (in part), I used mutate with row_number():
y <- x %>%
  mutate(n = row_number()) %>%
  spread(key = factors, value = case) %>%
  print()
# A tibble: 90 x 4
       n   a.1   a.2   a.3
   <int> <dbl> <dbl> <dbl>
 1     1  91.0    NA    NA
 2     2  56.2    NA    NA
 3     3  34.0    NA    NA
 4     4  85.1    NA    NA
 5     5  66.2    NA    NA
 6     6  21.7    NA    NA
 7     7  29.8    NA    NA
 8     8  80.3    NA    NA
 9     9  59.8    NA    NA
10    10  85.4    NA    NA
# … with 80 more rows
The three columns are returned, but the values are not aligned: each value sits in its own row, with NA in the other two columns. How can I fix this with a tidyverse function so that my data.frame ends up with 30 rows instead of 90?
You have to add a group_by(factors) before the mutate(n = row_number()). – Tomás Barcellos
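The grouping suggestion can be sketched as follows. This is a minimal example rather than a definitive solution: the `set.seed()` call, the `ungroup()` step and the final `select(-n)` are additions made here for reproducibility and tidiness, and `select(num_range(...))` is used in place of `select_at(vars(...))`. Grouping by `factors` makes `row_number()` restart at 1 within each key, so the same `n` identifies matching rows of `a.1`, `a.2` and `a.3`, and `spread()` can line them up.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Same data as in the question: columns a.1 ... a.6 and b
df_1 <- data.frame(
  a = replicate(6, runif(30, 20, 100)),
  b = rep(LETTERS[1:5], each = 6)
) %>%
  as_tibble()

# Long format: 90 rows, one per (factors, case) pair
x <- df_1 %>%
  select(num_range("a.", 1:3)) %>%
  gather(key = "factors", value = "case")

# Number the rows within each key, then spread back to wide format
y <- x %>%
  group_by(factors) %>%
  mutate(n = row_number()) %>%  # 1..30 inside each of a.1, a.2, a.3
  ungroup() %>%
  spread(key = factors, value = case) %>%
  select(-n)                    # drop the helper index once rows are aligned

dim(y)  # 30 rows, 3 columns
```

With the index computed per group, each value of `n` occurs once per key, so `spread()` has a unique identifier for every output row and no longer pads the result with NA.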