Best practice for deleted data


3

I am unsure how best to handle deleted data. The system should offer the client a "delete data" option, but deleting a record would also require erasing several other records related to it, and it would affect another client on the other end who has a relationship with that data.

What should I do? Keep a flag that marks the record's status as "recycle bin", showing it to the customer as if it had been deleted while keeping it in the database?

  • I recently had some bad experiences because we didn't have a recycle bin. I recommend adding a column to the table that says whether or not the record may be displayed. But never delete: users, and even we programmers, make mistakes.

  • Do you want the same record to appear as deleted for one user, and as not deleted for another? Then it’s more complicated than simply having a deleted status.

  • @bfavaretto Exactly. Because it is a marketplace, if a user deletes a record, the other client who already had a 'relationship' with it will be left without a reference to the data.

2 answers

6

As a general guideline for good storage practice, it is usually worth keeping records when they are "erased" and adding a flag that marks them as deleted. However, some questions must be answered first:

  • How often is this data restored?
  • Does it have some other purpose (monitoring...)?
  • Can it be partially moved to another table?
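
The flag approach described above could look roughly like this. It is only a minimal sketch using Python's built-in sqlite3 module; the `product` table, the `is_deleted` column, and the helper names are assumptions made for illustration, not something taken from the question.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " is_deleted INTEGER NOT NULL DEFAULT 0)"  # hypothetical flag column
)
conn.executemany("INSERT INTO product (name) VALUES (?)", [("A",), ("B",), ("C",)])
conn.commit()


def soft_delete(conn, product_id):
    """Mark the row as deleted instead of removing it."""
    with conn:
        conn.execute("UPDATE product SET is_deleted = 1 WHERE id = ?", (product_id,))


def restore(conn, product_id):
    """Undo a soft delete."""
    with conn:
        conn.execute("UPDATE product SET is_deleted = 0 WHERE id = ?", (product_id,))


def visible_products(conn):
    """Names the customer should still see."""
    rows = conn.execute("SELECT name FROM product WHERE is_deleted = 0 ORDER BY id")
    return [name for (name,) in rows]


def total_rows(conn):
    """All rows, including the ones flagged as deleted."""
    return conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

With this layout every query that feeds the user interface has to remember the `is_deleted = 0` filter, which is the main cost of the flag technique.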

In my own experience I prefer to keep this data, but in some cases I use temporary tables, which may help you. In other cases keeping it really doesn't make sense, for example session storage.

As you mentioned, deleting this data would affect other relationships. Note, however, that if you offer deletion to the user, this problem should be addressed in the modeling of your tables.

So the final answer is: it depends.


2


One technique I use, and it is nothing new, is to move the data to another table.

Normally I duplicate almost all tables, and for some I also create a third auxiliary table.

Example: a product option table

item_option
item_option_deleted
item_option_archived

The item_option table holds the original data. When a delete action is performed, the user has the option to move the record to the recycle bin or to delete it permanently.

When moved to the bin, the data goes to item_option_deleted; when deleted permanently, it goes to item_option_archived.

Nothing in *_archived is recoverable through the regular user interface.

That is, from the user's point of view, recovery exists only for data in *_deleted, while administrators also have access to the *_archived tables.
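
A minimal sketch of that move, written with Python's built-in sqlite3 module rather than MySQL. The table names come from the example above; the two columns and all function names are assumptions for illustration, and the three tables are assumed to share the same column layout.

```python
import sqlite3

COLUMNS = "id INTEGER PRIMARY KEY, name TEXT NOT NULL"  # assumed layout

conn = sqlite3.connect(":memory:")
for table in ("item_option", "item_option_deleted", "item_option_archived"):
    conn.execute(f"CREATE TABLE {table} ({COLUMNS})")
conn.executemany("INSERT INTO item_option (id, name) VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.commit()


def move_row(conn, source, target, option_id):
    """Copy one row to `target` and remove it from `source` atomically.

    Table names are interpolated into the SQL, so they must be trusted
    constants from the code, never user input.
    """
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            f"INSERT INTO {target} SELECT * FROM {source} WHERE id = ?", (option_id,)
        )
        conn.execute(f"DELETE FROM {source} WHERE id = ?", (option_id,))


def move_to_bin(conn, option_id):
    move_row(conn, "item_option", "item_option_deleted", option_id)


def restore_from_bin(conn, option_id):
    move_row(conn, "item_option_deleted", "item_option", option_id)


def delete_permanently(conn, option_id):
    move_row(conn, "item_option", "item_option_archived", option_id)


def ids_in(conn, table):
    return [row[0] for row in conn.execute(f"SELECT id FROM {table} ORDER BY id")]


move_to_bin(conn, 1)         # user sends option A to the recycle bin
delete_permanently(conn, 2)  # user deletes option B for good
```

Doing the copy and the delete in one transaction matters here: if only one of the two statements ran, the row would end up duplicated or lost.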

But this is not only about accidental deletion or a user who changes their mind. As mentioned in the question, one complication is the existing relationships.

In an online store, for example, a customer buys a product with options A and B. A week later the store decides to delete these options permanently. What happens to the purchase history of the customers who bought with those options?

This is where the item_option_archived table comes in. The customer's purchase history still references options A and B, but they are no longer present in item_option. In that case, a lookup is made in item_option_deleted and item_option_archived.

To avoid having almost three times as many tables in the database, I tried putting these tables in a separate database, but managing a second database is more complex, and where the hosting provider does not allow more than one database the system will obviously not work. So I preferred to keep everything in the same database.
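
The fallback lookup could be sketched as follows, again with Python's sqlite3 module. The `order_item` table, its columns, and the function names are hypothetical; only the three item_option tables come from the example above.

```python
import sqlite3

# Main table first, since that is where most lookups succeed.
LOOKUP_ORDER = ("item_option", "item_option_deleted", "item_option_archived")

conn = sqlite3.connect(":memory:")
for table in LOOKUP_ORDER:
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Hypothetical purchase-history table referencing options by id.
conn.execute("CREATE TABLE order_item (order_id INTEGER, option_id INTEGER)")

conn.execute("INSERT INTO item_option_archived VALUES (1, 'A')")  # deleted for good
conn.execute("INSERT INTO item_option_deleted VALUES (2, 'B')")   # in the bin
conn.execute("INSERT INTO item_option VALUES (3, 'C')")           # still live
conn.executemany("INSERT INTO order_item VALUES (?, ?)", [(100, 1), (100, 2), (100, 3)])
conn.commit()


def find_option(conn, option_id):
    """Return (name, table) for an option, or None if it exists nowhere."""
    for table in LOOKUP_ORDER:
        row = conn.execute(
            f"SELECT name FROM {table} WHERE id = ?", (option_id,)
        ).fetchone()
        if row is not None:
            return row[0], table
    return None


def order_history(conn, order_id):
    """Resolve every option of an order, wherever its row currently lives."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT option_id FROM order_item WHERE order_id = ? ORDER BY option_id",
            (order_id,),
        )
    ]
    return [find_option(conn, option_id) for option_id in ids]
```

The auxiliary tables are only touched when the main table has no match, so ordinary queries against live data never pay for them.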

Why not create a flag?

That's a choice that depends on the case. I prefer to keep a consistent pattern of moving the data to other tables, as described above, because the original table gets heavy with so many rows in "deleted" or "permanently deleted" status. It is very common for a small store to delete products permanently and, in a short time, end up with a table of 50,000 products of which only 1,200 are valid. The rest is junk that has already been deleted. This affects the performance of searches and even simple SELECTs. It is obviously faster to search 1,200 records than 50,000.

I emphasize that the flag technique is not wrong, because every case is different. There are cases where it is more convenient to just create a flag.

  • Wouldn't it be cheaper to keep an index than to query two tables?

  • Queries against the _deleted and _archived auxiliary tables are infrequent. Queries against the original table are far more frequent. Weighing cost and efficiency, it is a better approach than the flag technique in most of the cases I deal with.

  • Which database do you use?

  • 1

    It makes no difference, but I use MySQL.

  • 1

    Daniel, just a suggestion: since item_option_deleted will be present in all your queries, you could use table partitioning. In MySQL it is done as follows: https://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations.html. You would get a physical layout similar to your approach, but the application could abstract it away completely.

  • Usually _deleted and _archived are not present in queries. They are only needed when you have to look up a relationship that no longer exists in the main table. I know about partitioning, but I always avoid DBMS-specific features because I may need to switch DBMS, and that takes real work, especially if the other environment has no similar feature.
