Select and delete duplicate records by last update date?

Asked

Viewed 1,722 times

1

Among the fields in the table there is the name field, the dt_add field which is the date that the record was inserted and dt_update which is the date of the last update of the record.

I would like to know how to return and delete repeated records, for example if there are 100 records with equal names, delete those with last update date less than 01/11/2017


Explaining better:

In a system a problem was found that there were as a consequence insertions let’s say of more than 16 thousand repetitions for about 100 records with the same name and value in a given table.

It seems surreal, but more than 16 thousand duplicated records for each value and the error occurred with about 100 values. The error was perceived after analysis due to high resource consumption of the server, and it was observed that the error has existed since 2014 due to dates of insertions and updates of the records.

What will be done is delete all those that are repeated with date less than 01/11/2017 in order to mitigate the consumption of resources and will be made the system correction.

I don’t know if I could be clear enough for the context.

In short I need an SQL select to return all the repeated records and a delete in order to delete repeated records with date less than 01/11/2017.


About primary key

It has primary key yes, but it is a table of settings where there should be unique records (if not this problem) and the system is based on the reference of the configuration name and value.

Example:

nome : habilitar-alertas 
valor: true 

nome : bloquear-clientes-pendentes 
valor: false
  • explain better what you need please, and put the code you’re trying to do

  • I need an SQL select to return all repeated records and a delete in order to delete repeated records with a date less than 01/11/2017.

  • Do you have a primary key for this table ? Or just the 3 columns cited?

  • In the database, the dates are like, date?

  • Are the kind datetime

2 answers

1

I made the following query:

  delete t.* from configuracoes t
     inner join (select 
c.nome,
c.valor,
c.date_add,
c.date_upd
from configuracoes c
where c.date_upd = (select max(x.date_upd) from configuracoes x where x.nome = c.nome)) x 
        where x.nome = t.nome and x.date_upd != t.date_upd;

Sqlfiddle did not allow you to put the DML command on the execution side, but it worked perfectly on the schema side.

Upshot:

Current Data from the table:

Dados atuais da tabela

After the Delete execution:

Após a execução do delete

Important: As you mentioned, it is a table of settings and should only have one row for each configuration, I did not delete with the condition of the date being less than 01/11/2017 and yes, delete all that are duplicated, and keep only the latest.

Sqlfiddle Without the delete command: http://sqlfiddle.com/#! 9/e7c474/2

Sqlfiddle After delete command: http://sqlfiddle.com/#! 9/16a37/1

  • 1

    I did not understand the logic of SQL (lack of knowledge my probably srsrs), but the advice to go through the last ID made the way I used it right

0


I found the answer

Let’s assume, all records where there are records with more than 2 repetitions:

SELECT 
    nome, valor, dt_add, dt_update, COUNT(*) 
FROM 
    configuracoes 
GROUP BY 
    nome
HAVING 
    COUNT(*) > 2 
ORDER BY 
    dt_update DESC

To view records sorted by ID

SELECT 
    nome, valor, dt_add, dt_update 
FROM 
    configuracoes 
WHERE 
    nome = "habilitar-alertas"
 ORDER BY 
    id_configuracoes DESC

Delete with lower ID than current:

DELETE FROM 
    configuracoes 
WHERE 
    nome = "habilitar-alertas" and id_configuracoes < "id_mais_atual" ;
  • 1

    I’m already posting my tb, dps take a look

  • this code will erase all settings with lower date than specified, this will go wrong...rs

  • 1

    I decided to follow your advice and not use dates as reference rsrs

Browser other questions tagged

You are not signed in. Login or sign up in order to post.