Duplicate record in the database

Good evening, folks. I need help removing some duplicate records from a table in my MySQL 5.7 database.

By running the following query I can identify which records are duplicated:

SELECT source_id, COUNT(*) TOTAL 
FROM news 
GROUP BY source_id 
HAVING COUNT(*) > 1 

How can I delete the duplicate records found by the SELECT above and keep only the original of each?

  • And how do you identify which of the records with a duplicate source_id is the original? If duplicates are not allowed, wouldn't it be better to make this field the primary key or declare it UNIQUE?

  • Use SELECT DISTINCT.

  • If the table has a sequential ID, you can simply join the table with itself: JOIN ON a.source_id = b.source_id AND id_sequencial != MIN( id_sequencial ), which effectively filters out the "originals", that is, the rows with the smallest ID for each source_id. The IDs returned are the ones that should be deleted (test first, of course). If you don't have a unique ID, there's a good chance you have more serious problems than the deletion. You can do everything in one operation, but test with a SELECT (a subquery with the join) before switching to DELETE. And don't forget the most important part, which is to fix the application so that the problem no longer happens.
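
A minimal sketch of the self-join described in the last comment, assuming the table has a sequential primary key called id (a hypothetical name, since the question does not show the schema) and that the row with the lowest id for each source_id counts as the original. It is worth taking a backup and running the SELECT first, because the DELETE cannot be undone outside a transaction.

```sql
-- Assumption: news has a sequential primary key named id (hypothetical).

-- 1. Preview: every row that has an older sibling with the same source_id.
SELECT DISTINCT b.id, b.source_id
FROM news AS a
JOIN news AS b
  ON a.source_id = b.source_id
 AND a.id < b.id;

-- 2. Delete exactly those rows, keeping the lowest id per source_id.
DELETE b
FROM news AS a
JOIN news AS b
  ON a.source_id = b.source_id
 AND a.id < b.id;

-- 3. Optional, once no duplicates remain: have the database reject new ones.
--    The constraint name is hypothetical.
ALTER TABLE news ADD CONSTRAINT uq_news_source_id UNIQUE (source_id);
```

Rows whose source_id is NULL never match the join condition, so this sketch leaves them untouched.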

1 answer

You will need to follow a series of steps.

  1. Create a temporary table named news_temp.
  2. Run a SELECT DISTINCT on the news table so that each record is taken only once and the duplicates are ignored. In the SELECT, list every column except the primary key column.
  3. Then run an INSERT INTO like this:

    INSERT INTO news_temp (campoA, campoB)
      SELECT DISTINCT campoA, campoB
      FROM news;
    
  4. DROP TABLE news;

  5. RENAME TABLE news_temp TO news;

Unfortunately, the existing primary key values are discarded with this approach. If you need to preserve the primary key values, you will need to write an algorithm in another programming language.
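
The steps above could be sketched roughly as follows. campoA and campoB are the placeholder column names from the answer. Creating news_temp with CREATE TABLE ... LIKE is an assumption, and so is the primary key being AUTO_INCREMENT, which lets MySQL generate new key values for the copied rows. The sketch uses a regular table rather than CREATE TEMPORARY TABLE, because RENAME TABLE does not work on TEMPORARY tables.

```sql
-- Step 1: empty copy of news with the same columns and indexes
-- (assumption: a regular table, so that RENAME TABLE works later).
CREATE TABLE news_temp LIKE news;

-- Steps 2 and 3: copy one row per distinct combination of the listed columns.
-- The primary key column is left out, so new key values are generated
-- (assumption: the key is AUTO_INCREMENT).
INSERT INTO news_temp (campoA, campoB)
SELECT DISTINCT campoA, campoB
FROM news;

-- Check before dropping anything: any row returned here is a source_id
-- that is still duplicated, for example because another copied column differs.
SELECT source_id, COUNT(*) AS total
FROM news_temp
GROUP BY source_id
HAVING COUNT(*) > 1;

-- Steps 4 and 5: replace the original table with the de-duplicated copy.
DROP TABLE news;
RENAME TABLE news_temp TO news;
```

The check query only makes sense if source_id is among the copied columns.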

  • That is one way out, and you can inspect news_temp to check that everything worked before you drop the original. But if the table has a primary key, you can simply DELETE the duplicates in a single SQL operation and still keep the primary key. See my comment on the question.

  • Well said, @Bacco! That's it! +1 for the review!
