Is this kind of normalization feasible?

Asked

Viewed 150 times

0

É viável fazer esse tipo de normalizaçao?

From here you will have more tables where the person will classify as, owner, customer, supplier, among others. My question is, by separating one table for Cpf and another for cnpj from the person table, is it good? In my mind, with this normalization there will be no blank fields of Cpf and no blank fields of cnpj.

  • 2

    I may be mistaken, but unless you link other information in the table of Cpf, or cnpj, which would be more convenient, type information of the establishment, this "normalization" is unnecessary...

  • What would be feasible? Do you have some parameter to establish this? Although short already has strange thing. This thing of having a table for each type of entity is such an old thing...

2 answers

1

I suggest something modular, avoiding linking the structure to localized business rules.

As an example, I would not create 2 tables, one for CPF and one for CNPJ, even if it was a localized system. I suggest something generic:

person
    id (unique)
    date_created
    date_updated
    other_column1
    other_column2

person_document
    id (unique)
    date_created
    date_updated
    person_id (relacionado com person.id)
    document_id (relacionado com document.id)
    data char(50)

document
    id (unique)
    date_created
    date_updated
    country_id (relacionado com country.id)
    title (nome do documento)

country
    id (unique)
    date_created
    date_updated
    title (nome do país)

Many programmers think that when there are many tables the system is more complex, complicated, heavy, etc. But this is not good arguments because it does not influence performance in a relevant way. To understand, follows the milk below.

When you are making a small system in which you are sure that you will never use for something bigger and mainly international, it doesn’t make much sense to build this whole structure. But anyway it is a structure that works well in a small and local project as in a large or internationalized project. So, why not normalize within a flexible pattern that serves both cases instead of creating something "plastered" that will not be reusable?

This is one of the reasons for standardisation.

Better explaining the structure above.

We have the people table (person). In this table there is no information about what documents the person has. In the table Documents, is where the documents are registered, however, not yet linked with the table of person. In this table, all documents required by the business model will be registered.

The important thing here is not to link the structure to the business model, that is, avoid it being something localized.

In this table of documents, the various types of documents will be registered. CPF, CNPJ, RG, Driver’s License, Passport, etc.

The table person_document is where the relationships are made. In this table, only the serial number or code of the document will be registered. What identifies which document is the column document.id.

The column data is the type char(50) because not all documents are numerical. There are many types of documents that include alphanumeric characters. Hence the CHAR type is more suitable.

It is also important to note that the Documents table is related to the country table. This makes it possible to use the system for a business model where it is required to know the country of origin of the document independent of the person’s nationality.

That’s just a summary. Of course it is possible to improve, make some adjustments, etc. But it does not mean that it should do so and it does not claim to be the definitive solution.

With this generic structure, the system becomes more flexible and portable. You can use the same system for use in any country without having to modify the structure. Otherwise, if you do something linked to local rules, for example, with columns or tables specific to CPF, these tables or columns will be useless for other countries and often useless even within Brazil for cases of underage registration, immigrants, tourists, refugees, undocumented persons, temporary residents, etc.

It may seem a little strange to the table country related to document, because it could relate country with person, right? The problem in this is to plaster the system by disregarding people with multiple nationalities or even if you don’t even have dual nationality, you can have documents from other countries.

Note also that in all tables of the example I added the columns date_created and date_updated. Many systems follow this by default. Always have a column that tells the date and time it was registered and updated.
This is a subject that deviates a little from the subject, as it is related to activity log and optimization. If you prefer, you can ignore.

In the original table of the question tb_pessoa, there is a column pessoa_tipo. This is also a way to "plaster" the structure. For something more flexible, the ideal would be another table to define the person’s type. I preferred not to comment because it would become something repetitive by following the same logic presented in the proposed structure. And so it goes to any other columns or tables that can make the system (plastered) localized.

Do you really need so much? Even in a small project?

Here we come back to talk about whether this is really necessary. If you just want to build a box control system of a bar of the street corner on the outskirts of a small town in the countryside that has no more than 5,000 inhabitants, why complicate with all this?
Because I repeat again what was commented on some paragraphs above that, this structure serves both for a small project and for a large project. There may be a difference in performance but it is irrelevant. After all, in a small project there will not be a large flow. Therefore, the concern with performance is irrelevant. In a larger project will require much more than that to get a good performance.

The key here is portability. You can use the same structure for diverse business models.



Obs: The above examples are purely didactic.


About nomenclature, see this link: Table and column nomenclature

0

Good as I can not comment, I would do everything on the same table and still make a Unique to ensure uniqueness, better maintenance, an example of my:

customer{
  id integer
  document varchar,
  document_type varchar,
  ...
  unique(document, document_type)
}

Also, this way I am ensuring that if at some other time my client passes accept some other document who knows why, I do not need to create another table, one more things for headache.

Well this is a preference of mine, but goes as a hint: Put prefix in the field within a table, if it is tb_pessoa, doesn’t have to be pessoa_id, only id is enough, because if any time is going to make an Inner that needs joining tb_pessoa.pessoa_id already facilitates tb_pessoa.id and I don’t need to tell you either tb_.

Got out a little bit, but I hope I helped you with the main thing.

  • I in case would prefer to keep id_nome_da_tabela. When doing several inner join, may even be more expensive, but it is easy to know from which table comes the id, and consequently, fk, and in future reverse engineering it is easy to maintain

  • Well I at least wear something like that customers.id gives me the same vision and organization quoted, but of course, this is very much a matter of taste, even I will put an addendum in my answer.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.