First, I'd like to point out that it is always best to ask questions with a reproducible example. In your case, you should have provided the data.frame dados, which I ended up having to type out ;-). To learn how to ask a question with a reproducible example, read this help page: How to create a Minimal, Complete, and Verifiable example
In the first part I’m simply creating a data.frame like the one you provided in the image.
## Creating the example as a data.frame
dados <- data.frame(
  Processo = c(201701, 201701, 201702, 201702, 201702, 201703, 201703, 201704, 201704, 201704),
  Grupo = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'A', 'A'),
  Data = c('01/02/2017', '15/02/2017', '20/03/2017', '18/04/2017', '01/07/2017', '15/02/2017', '20/02/2017', '01/03/2017', NA, '05/06/2017')
)
Something important to know about R: when it reads a dataset containing dates, it initially treats them as text (character strings or factors), not as dates. You need to convert them to R's Date class before you can add or subtract dates:
## Converting to Date
dados$Data <- as.Date(dados$Data, format = '%d/%m/%Y')
Note the format argument, which tells R how the day, month, and year are laid out in the text. I used an uppercase Y because the year is written with 4 digits; a lowercase y would stand for a 2-digit year.
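As a small illustration of the conversion described above, the sketch below uses a couple of made-up values (the names texto, datas, data_curta, and dias are chosen only for this example) to show that the values start out as text and how the two year codes differ.

```r
# Illustrative values only; the variable names are hypothetical
texto <- c("01/02/2017", "15/02/2017")
class(texto)                      # "character": R does not treat it as a date yet

# %Y matches a four-digit year
datas <- as.Date(texto, format = "%d/%m/%Y")
class(datas)                      # "Date"

# %y (lowercase) matches a two-digit year
data_curta <- as.Date("15/02/17", format = "%d/%m/%y")

# Once converted, dates support arithmetic
dias <- datas[2] - datas[1]       # a difftime of 14 days
```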
Finally, use dplyr to group the rows and compute the difference between the latest and earliest date in each group. Note the na.rm option, which removes the NA before max and min are computed.
## Loading the dplyr package
library(dplyr)
## Grouping and computing the difference between dates with dplyr
dados %>%
  group_by(Processo, Grupo) %>%
  arrange(desc(Data)) %>%
  summarise(Total_Dias = max(Data, na.rm = TRUE) - min(Data, na.rm = TRUE))
The result is exactly the final table you posted:
# A tibble: 4 x 3
# Groups:   Processo [?]
  Processo Grupo Total_Dias
     <dbl> <fct> <time>
1  201701. A     14
2  201702. B     103
3  201703. C     5
4  201704. A     96
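To see why na.rm matters for a group like 201704, here is a minimal base R sketch using that group's three dates, one of which is NA. The variable names are chosen only for this example and are not part of the answer's code.

```r
# Dates of Processo 201704 from the example data, including the missing value
datas <- as.Date(c("01/03/2017", NA, "05/06/2017"), format = "%d/%m/%Y")

# Without na.rm, a single NA makes the whole result NA
sem_na_rm <- max(datas) - min(datas)

# With na.rm = TRUE, the NA is dropped before max and min are computed
com_na_rm <- max(datas, na.rm = TRUE) - min(datas, na.rm = TRUE)

# The result is a difftime; as.numeric() turns it into a plain number of days
total_dias <- as.numeric(com_na_rm, units = "days")   # 96
```

If a plain numeric column is preferred over the <time> column shown in the table, the same as.numeric() wrapper could be applied inside summarise().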