I have a data set and would like to select only the earliest date for each primary key (column MATRICULA). Here is an example of my data frame:
MATRICULA <- c(1,1,3,3,3,4,5,5,5,5,6)
DATA <- c('15/01/2018', '10/12/2017', '20/11/2017', '01/01/2015',
'25/10/2018', '02/07/2016', '03/12/2016','17/08/2017', '22/03/2018',
'12/06/2018', '13/04/2014')
DADOS <- data.frame(MATRICULA, DATA)
I already use abv_data = c(as.Date(DADOS$DATA,"%d/%m/%Y")) to convert the date strings to Date values.
I would like the result to keep only the row with the earliest date for each value of MATRICULA. The result I expect is:
MATRICULA <- c(1,3,4,5,6)
DATA <- c('10/12/2017', '01/01/2015', '02/07/2016', '03/12/2016', '13/04/2014')
DADOS <- data.frame(MATRICULA,DATA)
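For reference, here is a minimal sketch of one way this result could be produced in base R. It is not necessarily the best approach for very large data. It assumes the dates are always in dd/mm/yyyy format, and the names DATA_DATE, ordenado and resultado are made up for the illustration. The idea is to sort by key and date, then keep the first row of each key with duplicated().

```r
# Example data from the question
MATRICULA <- c(1,1,3,3,3,4,5,5,5,5,6)
DATA <- c('15/01/2018', '10/12/2017', '20/11/2017', '01/01/2015',
          '25/10/2018', '02/07/2016', '03/12/2016', '17/08/2017', '22/03/2018',
          '12/06/2018', '13/04/2014')
DADOS <- data.frame(MATRICULA, DATA, stringsAsFactors = FALSE)

# Parse the dd/mm/yyyy strings into Date values (DATA_DATE is a hypothetical helper column)
DADOS$DATA_DATE <- as.Date(DADOS$DATA, "%d/%m/%Y")

# Sort by key, then by date, so the earliest date comes first within each key
ordenado <- DADOS[order(DADOS$MATRICULA, DADOS$DATA_DATE), ]

# Keep only the first row per MATRICULA, with the original columns
resultado <- ordenado[!duplicated(ordenado$MATRICULA), c("MATRICULA", "DATA")]
rownames(resultado) <- NULL
print(resultado)
```

Sorting once and filtering with duplicated() avoids calling a function per group, which may help when the data frame has millions of rows.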
It worked better that way. Since my data frame has millions of rows, I did not succeed with aggregate.
– Bruno Avila