I am just getting started with R, and I have a data frame similar to the one below:
x <- data.frame(cod_produto = c(1,1,1,2,2,2),
valordia = c(0,0,150.23,110.98,18.65,0),
data = c("2019-01-01","2019-01-02","2019-01-03","2019-01-01","2019-01-02","2019-01-03"))
How can I return the first day on which a sale took place (valordia > 0)?
I tried using the dplyr package.
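A minimal sketch of one possible approach with dplyr, assuming the dates are ISO-formatted strings that as.Date() can parse: keep only the rows with a sale, then take the earliest date per product. The names primeira_venda and primeiraVenda are illustrative and are not part of the question.

```r
library(dplyr)

# Sample data from the question
x <- data.frame(
  cod_produto = c(1, 1, 1, 2, 2, 2),
  valordia = c(0, 0, 150.23, 110.98, 18.65, 0),
  data = c("2019-01-01", "2019-01-02", "2019-01-03",
           "2019-01-01", "2019-01-02", "2019-01-03")
)

# Keep only days with a sale, then take the earliest date per product
primeira_venda <- x %>%
  filter(valordia > 0) %>%
  group_by(cod_produto) %>%
  summarise(primeiraVenda = min(as.Date(data)))

print(primeira_venda)
```

Because the filter runs before the grouping, a product with no sales at all simply does not appear in the result.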
Thanks for the reply. Is it possible to use this inside a summarize?
– Ceconello
How would it be used (for example, to sum valordia)?
– neves
Something like the code below, which sums the total sales value and adds two fields with the dates of the first and last sale:

y <- x %>% 
 dplyr::group_by(cod_produto) %>%
 dplyr::summarize(totalVenda = sum(valordia),
 dataPrimeiraVenda = ??? , dataUltimaVenda = ???)
– Ceconello
Instead of summarise, which returns a simple summary of an operation, use mutate. Check whether this gives you what you want:
y <- x %>% 
 group_by(cod_produto) %>% 
 mutate(totalVenda = sum(valordia)) %>% 
 arrange(data) %>% 
 slice(c(1, n()))
– neves
The code you wrote returns two rows for each product (first date and last date). However, what I need is to group by product code (that is why I was using summarise). The result would look something like this, with a single record per product: cod_produto, totalVenda, dataPrimeiraVenda, dataUltimaVenda. The dates of the first and last sale must consider only rows where valordia > 0.
– Ceconello
Do you want the dates in separate columns, that is, dataPrimeiraVenda and dataUltimaVenda, and only where valordia > 0? If so, here it is:
library(tidyr)

x %>% group_by(cod_produto) %>% 
 mutate(totalVenda = sum(valordia)) %>% 
 arrange(data) %>% 
 slice(c(1, n())) %>% 
 spread(key = data, value = totalVenda) %>% 
 filter(valordia > 0)
The totalVenda values end up inside the date columns.
– neves
For the data frame in the question, the result should look like this:
structure(data.frame(cod_produto = c(1,2),totalVenda = c(150.23,129.63), dataPrimeiraVenda = c("2019-01-03","2019-01-01"), dataUltimaVenda = c("2019-01-03","2019-01-02")))
– Ceconello
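A sketch of how the result described above might be produced with a single summarise, assuming the dates can be converted with as.Date(): subsetting data by valordia > 0 inside each group restricts min() and max() to the days that had a sale. This is one possible approach, not necessarily what either commenter had in mind.

```r
library(dplyr)

# Sample data from the question
x <- data.frame(
  cod_produto = c(1, 1, 1, 2, 2, 2),
  valordia = c(0, 0, 150.23, 110.98, 18.65, 0),
  data = c("2019-01-01", "2019-01-02", "2019-01-03",
           "2019-01-01", "2019-01-02", "2019-01-03")
)

y <- x %>%
  mutate(data = as.Date(data)) %>%   # compare real dates, not strings
  group_by(cod_produto) %>%
  summarise(
    totalVenda = sum(valordia),
    # Only days with valordia > 0 count as sale days
    dataPrimeiraVenda = min(data[valordia > 0]),
    dataUltimaVenda = max(data[valordia > 0])
  )

print(y)
```

One caveat: for a product with no sales at all, min() and max() receive an empty vector and emit a warning, so such products would need to be filtered out or handled separately.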