The error
The error message indicates that apply(), called from f2(), is being run on an object that does not have two dimensions. This happens because mutate() applies the function to each column separately, and a column is a vector, which does not have two dimensions.
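A minimal sketch of the failure, using a made-up vector in place of a real column (the names coluna and resultado are illustrative, not from the question). It only shows that a single column carries no dim attribute, which is what apply() needs:

```r
# A single column is a plain vector: it has no dim attribute
coluna <- c(29.1, 69.8, 68.7)
dim(coluna)  # NULL

# apply() requires an object with dimensions, so this call fails;
# tryCatch() captures the error message instead of stopping the script
resultado <- tryCatch(
  apply(coluna, 1, sum),
  error = function(e) conditionMessage(e)
)
resultado
```

The same call works once the object has rows and columns, for example a matrix or a data frame.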
The solution
Row-wise operations are a non-trivial matter in the tidyverse. That is because this set of packages, and the philosophy behind it, was designed to work with tables in long format and by groups.
The best evidence of this is that three major tidyverse developers have each made an effort to tackle the question. Hadley Wickham created purrrlyr, Jenny Bryan covered the topic here (and mainly here), and Romain François, the current maintainer of dplyr, recently created this package.
The answer I offer, then, is to use purrr::transpose() to solve the problem.
purrr offers the function transpose(), which turns lista[[1]][[2]] into lista[[2]][[1]]. Using this function we can create a list-column that holds each row.
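library(dplyr)
library(purrr)
tidy_data <- data_1 %>%
  as_tibble() %>%
  # transpose() turns the table into a list of rows; unlist() makes each row a plain vector
  mutate(linhas = transpose(data_1) %>% map(unlist))
tidy_data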
# A tibble: 30 x 7
var1 var2 var3 var4 var5 var6 linhas
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>
1 29.1 56.5 89.2 33.3 79.5 55.1 <dbl [6]>
2 69.8 41.2 23.3 92.0 93.3 38.3 <dbl [6]>
3 68.7 44.4 45.4 30.7 99.6 26.6 <dbl [6]>
4 69.9 60.6 21.1 30.5 95.4 88.0 <dbl [6]>
5 88.9 34.5 39.1 28.4 58.9 38.8 <dbl [6]>
6 71.2 80.8 76.5 60.9 42.7 99.1 <dbl [6]>
7 20.8 36.1 44.6 44.0 40.1 68.2 <dbl [6]>
8 38.6 40.7 60.7 22.1 60.3 99.9 <dbl [6]>
9 73.3 99.4 24.1 44.8 59.8 50.0 <dbl [6]>
10 61.1 84.6 65.2 79.4 45.5 64.4 <dbl [6]>
# ... with 20 more rows
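To see what transpose() does on its own, here is a minimal sketch on a small, made-up nested list (the values and the name t_lista are illustrative, not part of the original data):

```r
library(purrr)

# Hypothetical nested list: 2 elements, each holding 3 values
lista <- list(list(1, 2, 3), list(4, 5, 6))

# transpose() swaps the two indexing levels
t_lista <- transpose(lista)

length(lista)    # 2
length(t_lista)  # 3

# Element [[1]][[2]] of the original is element [[2]][[1]] of the result
identical(lista[[1]][[2]], t_lista[[2]][[1]])  # TRUE
```

A data frame is a list of columns, so transposing it yields a list of rows, which is why it can be stored as a list-column.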
Once the list-column is in place, just apply your function to each row with mutate() + map().
tidy_data %>%
  mutate(estats = map(linhas, f1))
# A tibble: 30 x 8
var1 var2 var3 var4 var5 var6 linhas estats
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
1 29.1 56.5 89.2 33.3 79.5 55.1 <dbl [6]> <dbl [3]>
2 69.8 41.2 23.3 92.0 93.3 38.3 <dbl [6]> <dbl [3]>
3 68.7 44.4 45.4 30.7 99.6 26.6 <dbl [6]> <dbl [3]>
4 69.9 60.6 21.1 30.5 95.4 88.0 <dbl [6]> <dbl [3]>
5 88.9 34.5 39.1 28.4 58.9 38.8 <dbl [6]> <dbl [3]>
6 71.2 80.8 76.5 60.9 42.7 99.1 <dbl [6]> <dbl [3]>
7 20.8 36.1 44.6 44.0 40.1 68.2 <dbl [6]> <dbl [3]>
8 38.6 40.7 60.7 22.1 60.3 99.9 <dbl [6]> <dbl [3]>
9 73.3 99.4 24.1 44.8 59.8 50.0 <dbl [6]> <dbl [3]>
10 61.1 84.6 65.2 79.4 45.5 64.4 <dbl [6]> <dbl [3]>
# ... with 20 more rows
The solution above leaves the result in a list-column. If you are not comfortable with list-columns, we can expand the mutate() into one column per statistic, which gives
tidy_data %>%
  mutate(s = map_dbl(linhas, sum),
         m = map_dbl(linhas, mean),
         v = map_dbl(linhas, sd))
# A tibble: 30 x 10
var1 var2 var3 var4 var5 var6 linhas s m v
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <dbl> <dbl> <dbl>
1 29.1 56.5 89.2 33.3 79.5 55.1 <dbl [6]> 343. 57.1 24.0
2 69.8 41.2 23.3 92.0 93.3 38.3 <dbl [6]> 358. 59.7 29.7
3 68.7 44.4 45.4 30.7 99.6 26.6 <dbl [6]> 315. 52.6 27.4
4 69.9 60.6 21.1 30.5 95.4 88.0 <dbl [6]> 365. 60.9 30.0
5 88.9 34.5 39.1 28.4 58.9 38.8 <dbl [6]> 289. 48.1 22.4
6 71.2 80.8 76.5 60.9 42.7 99.1 <dbl [6]> 431. 71.9 19.0
7 20.8 36.1 44.6 44.0 40.1 68.2 <dbl [6]> 254. 42.3 15.4
8 38.6 40.7 60.7 22.1 60.3 99.9 <dbl [6]> 322. 53.7 26.9
9 73.3 99.4 24.1 44.8 59.8 50.0 <dbl [6]> 351. 58.6 25.8
10 61.1 84.6 65.2 79.4 45.5 64.4 <dbl [6]> 400. 66.7 13.9
# ... with 20 more rows
Another possible solution would be to put the table in long format, group the data by row and create a summary with the statistics.
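A rough sketch of that long-format approach, assuming tidyr's pivot_longer() is available. The table exemplo, the helper column row_id and the columns variavel and valor are hypothetical stand-ins, not names from the question:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for data_1, with only numeric columns
set.seed(1)
exemplo <- tibble(
  var1 = runif(5, 20, 100),
  var2 = runif(5, 20, 100),
  var3 = runif(5, 20, 100)
)

resumo <- exemplo %>%
  # Keep track of which row each value came from
  mutate(row_id = row_number()) %>%
  # One line per (row, variable) pair
  pivot_longer(cols = -row_id, names_to = "variavel", values_to = "valor") %>%
  # Summarise each original row
  group_by(row_id) %>%
  summarise(s = sum(valor), m = mean(valor), v = sd(valor))

resumo
```

The result has one line per original row, so it can be joined back to the wide table by row_id if the original columns are still needed.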
These solutions are more robust than transforming the table into a matrix, because converting to a matrix coerces all the data to character if the table contains a character column.
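The coercion can be seen in a small sketch with a made-up table (tabela and its id column are hypothetical, not from the question):

```r
# Hypothetical table with one character column alongside numeric ones
tabela <- data.frame(
  id   = c("a", "b"),
  var1 = c(29.1, 69.8),
  var2 = c(56.5, 41.2)
)

# A matrix can hold only one type, so everything becomes character
m <- as.matrix(tabela)
typeof(m)  # "character"

# With only the numeric columns the matrix stays numeric
m_num <- as.matrix(tabela[, c("var1", "var2")])
typeof(m_num)  # "double"
```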