Physical Deletion vs Logical Deletion

12

  • Physical deletion is not recommended in most cases, because you will not be able to reactivate the record. That is one less feature in your application/system, and there is no reason to physically delete it.

  • @Gabe, out of curiosity: how did you find the original source?

  • 4

    @bfavaretto I was told. Nothing surpasses the power of the masses :D

5 answers

11


Generally, use physical deletion only when you know for sure that you will no longer need the record in the table.

Use logical deletion when it is possible that you:

  • may need to restore the record at some point (EU)

  • may need to get information from deleted records (often requested by the client rather than by the user).

The decision to use one or the other depends on each individual case. But in most cases the recommendation is to use logical deletion whenever there is even the slightest chance of needing to restore deleted records.
A query with WHERE tabela.Excluido = 0 does not usually degrade performance; the bigger concern is the volume of data to store, and of course the ability to restore.
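To make the flag approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table `tabela` and the column `Excluido` come from the WHERE clause above; the `nome` column, the sample rows and the helper `active_names` are assumptions made only for illustration.

```python
import sqlite3

# Hypothetical schema: only the table and flag names come from the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tabela ("
    " id INTEGER PRIMARY KEY,"
    " nome TEXT NOT NULL,"
    " Excluido INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO tabela (nome) VALUES (?)",
    [("Ana",), ("Bruno",), ("Carla",)],
)

def active_names(conn):
    """Return the names of rows that are not logically deleted."""
    rows = conn.execute(
        "SELECT nome FROM tabela WHERE Excluido = 0 ORDER BY id"
    )
    return [name for (name,) in rows]

# Logical deletion: flag the row instead of issuing a DELETE.
conn.execute("UPDATE tabela SET Excluido = 1 WHERE nome = ?", ("Bruno",))
after_delete = active_names(conn)

# Restoring the record is just flipping the flag back.
conn.execute("UPDATE tabela SET Excluido = 0 WHERE nome = ?", ("Bruno",))
after_restore = active_names(conn)
```

The row never leaves the table, which is why restoring is trivial and why the stored volume keeps growing.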


An alternative, used for tables that may grow very large with a high proportion of deleted rows, is to move the deleted records to a separate table reserved for them, so as not to affect the performance of the main table.

In that case, care must be taken, because the records may not be easy to restore if the table has an auto-increment primary key. Such tables are usually used for "historical queries".

This approach can also serve legal requirements, such as keeping data about the logins made by users on a site for X years, as opposed to the purely informative record shown to the user/client of their last login.
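The move to a separate table could look roughly like the sketch below, again with sqlite3. All names (`clientes`, `clientes_excluidos`, `excluido_em`, `archive`, `restore`) are hypothetical. The archive keeps the original id in a plain key column, which is one way to keep restoration simple; SQLite accepts an explicit value for an auto-increment key on re-insert, while other engines may need extra steps, which is the kind of care mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clientes (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        nome TEXT NOT NULL
    );
    -- The archive stores the original id as-is, so it is not regenerated.
    CREATE TABLE clientes_excluidos (
        id          INTEGER PRIMARY KEY,
        nome        TEXT NOT NULL,
        excluido_em TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    INSERT INTO clientes (nome) VALUES ('Ana'), ('Bruno'), ('Carla');
""")

def archive(conn, row_id):
    """Move one row from the main table to the archive, atomically."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO clientes_excluidos (id, nome) "
            "SELECT id, nome FROM clientes WHERE id = ?",
            (row_id,),
        )
        conn.execute("DELETE FROM clientes WHERE id = ?", (row_id,))

def restore(conn, row_id):
    """Move one row back to the main table, keeping its original id."""
    with conn:
        conn.execute(
            "INSERT INTO clientes (id, nome) "
            "SELECT id, nome FROM clientes_excluidos WHERE id = ?",
            (row_id,),
        )
        conn.execute("DELETE FROM clientes_excluidos WHERE id = ?", (row_id,))

def main_ids(conn):
    return [i for (i,) in conn.execute("SELECT id FROM clientes ORDER BY id")]

archive(conn, 2)
ids_after_archive = main_ids(conn)
restore(conn, 2)
ids_after_restore = main_ids(conn)
```

Doing both statements in one transaction avoids a state where the row exists in both tables or in neither.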

10

Advantages of using a logical deletion:

  • Audits: if you use a date-time field to record the deletion instead of a simple boolean/string, you know when the record was deleted. You can still use indexes and searches without problems, because your queries will be of the form WHERE dtExclusao IS NULL or something similar.

  • Easier deletion: no need to worry about maintaining integrity when deleting, as there is no cascading deletion.

  • The possibility of implementing an intelligence/BI solution in the future, since there will be a history of all the data that can be used.
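A short sketch of the date-time variant from the first bullet, using sqlite3. The column name `dtExclusao` comes from the answer; the table `pedidos`, the index name and the sample timestamp are illustrative assumptions. Note that in SQL a comparison written as `= NULL` never matches any row, so the test for "not deleted" has to be `IS NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pedidos ("
    " id INTEGER PRIMARY KEY,"
    " descricao TEXT NOT NULL,"
    " dtExclusao TEXT)"  # NULL means the row has not been deleted
)
conn.execute("CREATE INDEX idx_pedidos_dtExclusao ON pedidos (dtExclusao)")
conn.executemany(
    "INSERT INTO pedidos (descricao) VALUES (?)", [("A",), ("B",)]
)

# Logical deletion records *when* the row was deleted (sample value).
conn.execute(
    "UPDATE pedidos SET dtExclusao = ? WHERE id = ?",
    ("2024-01-15 10:30:00", 2),
)

# Correct filter for rows that are still active.
active = [i for (i,) in conn.execute(
    "SELECT id FROM pedidos WHERE dtExclusao IS NULL"
)]

# '= NULL' evaluates to unknown for every row, so nothing is returned.
wrong = conn.execute(
    "SELECT id FROM pedidos WHERE dtExclusao = NULL"
).fetchall()

# Audit query: what was deleted, and when.
audit = conn.execute(
    "SELECT id, dtExclusao FROM pedidos WHERE dtExclusao IS NOT NULL"
).fetchall()
```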

Disadvantages:

  • You will have to control the deletion in your code, always remembering to include a WHERE clause so that the deleted rows are left out.

  • If the table is very large and fragmented (with a few non-deleted rows scattered among many deleted ones), performance may drop, because tables are stored in "pages" on disk and only some of them are loaded into memory. If the non-deleted data all sits in a few pages, great. But if your index says there is valid data on every page, your database will waste a huge amount of time reading each page into memory, extracting only the row you need, and loading the next page. To get around this, it would be worthwhile to keep a table of active data and move the deleted rows to another table.

  • The data that a cascading deletion would have removed all stays in the same tables. That is, a record is marked as deleted in the parent table, but in the child tables nothing indicates whether the data is still in use, and those tables can grow large without any way to move the erased data to a special table.

  • Backup time/cost: full backups will carry a lot of data that was, for practical purposes, deleted.

  • Slower writes: if the indexes are large (since the tables can be large, storing all the deleted records), each insert may require significant time to update the index, so every write can perform poorly.
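For the first disadvantage in the list above, one common way to reduce the risk of a forgotten WHERE is to read through a view that already applies the filter. This is a mitigation sketched here under assumptions, not something the answer itself proposes; the names `usuarios`, `usuarios_ativos` and `excluido` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usuarios (
        id       INTEGER PRIMARY KEY,
        nome     TEXT NOT NULL,
        excluido INTEGER NOT NULL DEFAULT 0
    );
    -- Application queries read from the view, so the filter lives in one place.
    CREATE VIEW usuarios_ativos AS
        SELECT id, nome FROM usuarios WHERE excluido = 0;
    INSERT INTO usuarios (nome, excluido)
        VALUES ('Ana', 0), ('Bruno', 1), ('Carla', 0);
""")

# The base table still holds every row, deleted or not.
total_rows = conn.execute("SELECT COUNT(*) FROM usuarios").fetchone()[0]

# The view hides the logically deleted ones without a WHERE in the caller.
visible = [n for (n,) in conn.execute(
    "SELECT nome FROM usuarios_ativos ORDER BY id"
)]
```

The view does nothing for the storage, backup and index costs listed above; it only centralizes the filter.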

8

Well, the advantages and disadvantages of the two types of deletion depend very much on your system’s needs.

Before deciding what is the best practice for your case, I recommend paying attention to the following aspects:

  • Setting a flag to indicate whether the record is active implies having to perform checks (WHERE clauses) in every query from that point forward;
  • Performance can indeed become a major problem when you opt for logical deletion, so you need to take into account the complexity of your database and the relationships between entities to verify whether this would really be a hindrance;
  • It is essential to analyze how often previously deleted data is requested/searched. From this you will know whether the best option is to use a bit field, to create an additional table that stores certain information, or simply to add fields that hold the date of modification and/or deletion for broader control.

If you need to keep inactive records, go ahead: use backups to protect your data, and implement triggers to divert records and save resources as needed. In short, granting delete permission only to users who really should have it, discarding whatever prevents optimization of your system, and promoting good human-machine interaction is always the best alternative.

6

There are two main purposes:

The first is that you do not lose the information permanently: it remains somewhere, for audit purposes for example (there are better solutions for this, such as keeping dedicated history records).

The second is maintaining relationships. Take the typical case of disabling a user of a system. While active, this user performed operations, created records, and so on. Each of these entities is linked to the responsible user through fields such as "id_usuario", "created_by", etc., all pointing to that row of the user table. If you removed this user with a DELETE, all these records would be broken, pointing to a user who no longer exists. Here it is convenient to just deactivate the user, so that they can no longer log in to the system, while you can still consult the records they left behind, which will keep pointing to the correct user.

As for whether it is common practice:

In some organizations this is mandatory: it is simply forbidden to DELETE records in the database, even when there would be no need to keep them. So the ideal is to really evaluate each situation and see what the best solution is. In many cases logical deletion really is useful, and in others you will have to use it even if it is not :-)
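The relationship argument can be sketched with sqlite3 as follows. The column `id_usuario` comes from the answer; the tables `usuario` and `registro`, the `ativo` flag and the sample data are assumptions. With the foreign key enforced, the database refuses the DELETE outright; without the constraint, the same DELETE would succeed and leave the child rows pointing at nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE usuario (
        id    INTEGER PRIMARY KEY,
        nome  TEXT NOT NULL,
        ativo INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE registro (
        id         INTEGER PRIMARY KEY,
        descricao  TEXT NOT NULL,
        id_usuario INTEGER NOT NULL REFERENCES usuario (id)
    );
    INSERT INTO usuario (nome) VALUES ('Ana');
    INSERT INTO registro (descricao, id_usuario) VALUES ('Created invoice', 1);
""")

# Physical deletion is rejected because a record still references the user.
try:
    conn.execute("DELETE FROM usuario WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    conn.rollback()
    delete_blocked = True

# Logical deletion: deactivate the user and keep every reference valid.
conn.execute("UPDATE usuario SET ativo = 0 WHERE id = 1")
history = conn.execute(
    "SELECT r.descricao, u.nome, u.ativo "
    "FROM registro r JOIN usuario u ON u.id = r.id_usuario"
).fetchall()
```

A login check would then filter on `ativo = 1`, while reports keep joining to the full user table.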

3

Avoid logical deletion, so that you do not accumulate useless and irrelevant data in your database.

Use logical deletion only when necessary! It is a common and safe practice, as long as the application code handles it correctly (the Laravel framework, for example, handles logical deletion with great ease for the developer).

The question that remains is: when would it be necessary to use logical deletion?

Through logical deletion, you can recover "deleted" data, i.e., you preserve history and maintain integrity in the database.

An example: a product in an online store. As long as there are orders containing this product, it is convenient to delete the product logically. That way the items in past orders can still be identified, even though the product has been "deleted". In other words, with the logical deletion of the product you preserve the data of those orders.

There are many other cases where logical deletion is useful, important and valuable. But if it is not useful or necessary, it is best to do the physical deletion, so as not to leave the database polluted.
