Row operations on a data frame in R

I started using R recently. I would like to subtract corresponding rows of a data frame:

[Image: sample data]

The numbers in the ID column are fixed, but the data in the VALUE column are not. The idea is to subtract the values linked to IDs 1 and 4, 2 and 5, 3 and 6, as shown in the table image. In this case, a table of N rows would result in a new column of N/2 rows.

If there’s any way to do it with dplyr, that would be fantastic.

The following data can be used to reproduce the problem.

set.seed(37)
dados <- tibble::tibble(id = 1:6, valor = rnorm(6, 100, 20))

Thank you

1 answer

Here are two possible ways to solve this problem.

First, let’s create the data:

set.seed(37)
library(dplyr)
dados <- tibble::tibble(id = 1:6, valor = rnorm(6, 100, 20))

First way: create a variable that puts the values to be subtracted on the same row, so that they can be related:

dados %>% 
  mutate(valor2 = lag(valor, nrow(dados)/2),
         dif = valor2 - valor)
#> # A tibble: 6 x 4
#>      id valor valor2   dif
#>   <int> <dbl>  <dbl> <dbl>
#> 1     1 102.     NA  NA   
#> 2     2 108.     NA  NA   
#> 3     3 112.     NA  NA   
#> 4     4  94.1   102.  8.37
#> 5     5  83.4   108. 24.2 
#> 6     6  93.3   112. 18.2 

You can add filter(!is.na(dif)) if you don’t want the NA rows to show up.
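
As a minimal sketch of that filtered version (the name resultado is only illustrative, and n() is used here in place of nrow(dados) so the pipeline does not refer to the data frame by name), assuming the number of rows is even and row i pairs with row i + N/2:

```r
library(dplyr)

set.seed(37)
dados <- tibble::tibble(id = 1:6, valor = rnorm(6, 100, 20))

resultado <- dados %>%
  # shift valor down by half the number of rows, so row 4 sees row 1, etc.
  mutate(valor2 = lag(valor, n() / 2),
         dif = valor2 - valor) %>%
  # the first half has no partner above it, so drop those NA rows
  filter(!is.na(dif))

resultado
```

This should leave N/2 rows (ids 4 to 6 in the example), each carrying the difference for its pair.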

The second way is to create a variable that pairs up the rows (called par, "pair", here), and then use summarise() to do the calculation.

dados %>% 
  mutate(par = rep(seq_len(nrow(dados)/2), 2)) %>% 
  group_by(par) %>% 
  summarise(ids = paste(id, collapse = " - "),
            dif = diff(rev(valor)))
#> # A tibble: 3 x 3
#>     par ids     dif
#>   <int> <chr> <dbl>
#> 1     1 1 - 4  8.37
#> 2     2 2 - 5 24.2 
#> 3     3 3 - 6 18.2 
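
To see why diff(rev(valor)) gives "first minus second" within each pair, here is a small sketch. The values in par_exemplo are made up, and the names passo_a_passo, metade and dif_base are only illustrative. The last lines show a possible base R equivalent, under the same assumption that the number of rows is even and row i pairs with row i + N/2:

```r
# A pair laid out as c(first, second), with made-up values
par_exemplo <- c(102, 94)
# rev() gives c(94, 102); diff() then computes 102 - 94
passo_a_passo <- diff(rev(par_exemplo))

set.seed(37)
dados <- tibble::tibble(id = 1:6, valor = rnorm(6, 100, 20))

metade <- nrow(dados) / 2
# Same differences without grouping: first half minus second half
dif_base <- dados$valor[seq_len(metade)] - dados$valor[metade + seq_len(metade)]
dif_base
```

The grouped version keeps the pair labels next to each difference, while the base R version only returns the vector of N/2 differences.
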
  • Fantastic. It worked fine. My data frame has more than 20k rows. The proposed structure will help me fine-tune other code I have. Thank you so much for your help!
