Delete from duplicate records

Asked

Viewed 47 times

0

I have a basis where the records exist in the duplicate identifier number, in some scenarios the date of the newer record differs from the old one, but the number and the same (I have to delete the record with the smaller date) in other cases the dates and numbers of the identifier are equal (I have to keep only one record). I am using the data below, but duplicate records are not being erased.

Deletes:

1 - If I use this erases all records.

DELETE RES1 FROM REGISTRO AS RES1, REGISTRO AS RES2 WHERE RES1.HR_TRA <= RES2.HR_TRA AND RES1.NUMERO = RES2.NUMERO 
AND RES1.DT_DATA <= RES2.DT_DATA;

2 - If you use this, kill duplicate records.

DELETE RES1 FROM REGISTRO AS RES1, REGISTRO AS RES2 WHERE RES1.HR_TRA < RES2.HR_TRA AND RES1.NUMERO = RES2.NUMERO 
AND RES1.DT_DATA <= RES2.DT_DATA;

Creation and base population:

-- CRIANDO BASE DE TESTE REGISTROS DUPLICADOS

CREATE DATABASE TESTE;
USE TESTE;
CREATE TABLE REGISTRO 
(
ID INT PRIMARY KEY AUTO_INCREMENT,
NUMERO INT(10),
NOME VARCHAR(20),
VALOR INT(10),
DATA_ATU DATETIME,
HR_TRA TIME,
DT_DATA DATE
);

INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('2','TESTE', '200', '2020/02/04 17:06:06', '17:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('2','TESTE', '200', '2020/02/04 17:06:06', '17:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('2','TESTE', '200', '2020/02/04 16:06:06', '16:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('2','TESTE', '200', '2020/02/04 15:06:06', '15:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('2','TESTE', '200', '2020/02/04 15:06:06', '15:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('1','TESTE', '200', '2020/02/04 15:06:06', '15:06:06', '2020/02/04');
INSERT INTO REGISTRO (NUMERO, NOME, VALOR, DATA_ATU, HR_TRA, DT_DATA) VALUES ('3','TESTE', '200', '2020/02/04 15:06:06', '15:06:06', '2020/02/04');
  • What would be the expected result?

  • I believe you have understood what you need, you will need support from a mysql cursor or a language to loop the duplicate records. Only with the delete command do not know if you can do what you need.

  • I think it should also work, thank you!!!

1 answer

0


This command seems to do what you want:

delete res1 from REGISTRO res1
inner join REGISTRO res2
where res1.numero = res2.numero and 
   (res1.dt_data < res2.dt_data or 
        (res1.dt_data = res2.dt_data and 
         res1.id < res2.id)
    );

The trick was to use the "id" column in addition to the "dt_data" column. When two lines have the same date, then we choose the smaller id to delete.

  • I performed the test here, super worked. Thank you!!

  • Blz, don’t forget to mark the answer as the correct one, which yields me a few points :D

  • 1

    Ready, marked as right!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.