Table normalization for 2nd Normal Form

Asked

Viewed 951 times

1

A relation is in 2FN if and only if it is in 1FN and not contains partial dependencies.

From this definition I am normalizing a BD to 2FN. This BD has 2 tables, which illustrates the figure with their respective attributes:

Estado atual das tabelas

To candidate key cpf table cliente, is causing partial dependency on this format, right? Because only id already the reference.

Rearranging cliente according to 2FN I ended up defining it like this, the central idea of my normalization thinking is correct?

Is there a better way so I can normalize it?

Novo modelo

  • "Would like to"... ?

  • Related: https://answall.com/q/151323/64969

  • Whoa, I just finished editing the ending. So I’m wondering if Cpf leaves the table disrespecting transitivity, because despite seeing it this way, some of my team friends have opinions contrary to mine. From this I would like to understand then if I am wrong and who is wrong and why. Thanks for sharing the link, it helps as an additional study material, but I need help in this specific case.

3 answers

4

Normalization exists essentially to solve redundancies. See some redundancy there?

Address

None shown. Is it possible for the customer to have more than one address? Is it possible for more than one customer to have the same address? If you can, maybe makes sense to apply the normalization in this case. With only one address it makes no sense to make this separation. Neither the second nor any normal form applies.

Note that in the second example you created a table called Cliente Endereço. Why are you putting an agency address on it? It doesn’t make sense.

Let’s assume that in fact it is an address table in general. Then it would even fit to put the address there. But for that?

Nothing prevents you from having a table only for addresses, but if there is no repetition of the data you are not doing this by normalization. This makes sense if entities may have more than one address or more than one of them have the same address.

I did not enter the normalization of the address table because it does not seem to be the focus of the question and in the presented form may even be normalized, has no data indicating what are the columns, not even the types, could well be ids of standard data. It may also not be that this.

Agency

A table has been created to separate the agency it belongs to. Again, do you have more than one agency that the client might have? It’s the same question as the address. What do I get? What problem do you think you solved? Normalization needs to solve problems, not cause new ones. I saw no advantage.

Best model

Even these cases can be questioned in modern databases. It does not cost so much to maintain space for more than one address in the entity itself. It is not always a problem to have a very possible repetition of address registration when two entities are at the same address. It’s a matter of pragmatism.

If you can have these cases, strictly speaking, you should normalize, but an experienced developer will analyze how much the effort is worth because it complicates the model by making all the code and performance difficult. So you have to think if it pays off, if it’s so necessary.

On the other hand the whole model may be wrong. This idea of separating entities into customer, supplier, bank, etc. is wrong by nature. At least in the form presented.

Entities are natural and legal persons (separate). Customer, supplier, bank, carrier, seller, employee, etc. are roles that these persons exercise in this organization. These tables should only have data about the roles. The data of the person themselves should be in the tables of legal and physical persons. This is the right thing to do, but again it is pragmatically possible to do otherwise if it makes sense. But you need to choose another way because it’s the best way and not because it’s the only way you know how. Not all cases make sense to separate like this, but in general is the most correct.

I could point out other possible errors in this model. But it would be speculation because it is only right or wrong knowing all the requirements. The mistakes I see would be according to my experience, not with the real case, are could be right or not.

Normalization

To know how to normalize you need to know the objectives. To be the most correct formally, to be the fastest to develop, to be the easiest to maintain, to be the most performative, to be what the teacher taught or the boss ordered even if it’s not the best, or whatever. You can’t normalize blindly. How far to normalize and where to stop is something you’ll learn over time.

  • Miner, then I’ll redo. I had decided to normalize this way because I can not see the client CPF not creating partial dependency with numAgencia, you do not see this way? I even understood everything else that explained to me, but this relationship, (id, numAgencia > Cpf) is still complicated, and with (id > CPF), could help me understand this, thanks for the link

  • I can’t see a relationship between these two things. CPF is the client, agency has nothing to do with this.

3


Your tables Cliente and Agência are in the first normal form because no field is multivariate, so let’s focus on the second normal form.

There are two candidate keys in the table Cliente: id and cpf. Clearly from the id, we can obtain any other field and from the cpf also. Nor can we use just a part of the id or of cpf to that end.

Already on the table Agencia, the only candidate key is the id and the other fields are defined on the basis of id and not just part of id.

Meanwhile, the fields estado are determined by the fields cidade which in turn are determined by the fields rua on the table Cliente and cep on the table Agencia. That is, the fields cidade and estado are violations of the normal second way because they do not depend on the primary key, but on some other field. Your attempt to normalize does not correct this problem completely, but it is already a step in some direction.

The solution would be:

  • Create a table Estado with the fields codigo and nome.

    The codigo is the primary key, while the nome is another candidate key because we cannot have two states with the same name.

  • Create a table Cidade with the fields codigo_estado, codigo_cidade and nome.

    The codigo_cidade is the primary key, the codigo_estado is the foreign key to the table Estado.

    The fields codigo_estado and nome are also a candidate key because there can be two cities with the same name in the same state.

    The field nome alone is not a candidate key because there may be cities with the same name in different states (such as Cascavel, name of a city in Paraná and one in Ceará).

  • Create a table Logradouro with the fields cep, codigo_cidade, bairro and nome.

    cep is the primary key.

    codigo_cidade is the foreign key to the table Cidade.

    You would think that codigo_cidade and nome together were a candidate key, and in almost every city this would be true, but it is not always. In São Paulo, for example, there are three different streets called "Rua Piracicaba". Therefore, the candidate key is codigo_cidade, nome and bairro.

  • Create a table Endereco with the fields id, cep, numero_logradouro and complemento.

    id is the primary key.

    cep is the foreign key to the table Logradouro.

    Interestingly, the street number is not always numerical. For example: "Rua João da Silva, 148-B, back house".

  • In the tables Agencia and Cliente, put a foreign key to Endereco and remove other address fields.

This model is not yet perfect, because the same street can have different zip codes and maybe it was the case to have a table of neighborhoods and also a table of types of streets (avenue, street, square, mall, lane, highway, block, etc.). However, for your purpose, that should be enough.

  • Wouldn’t that generate an excessive amount of tables? As I am not in the market yet, this would be a correct procedure to normalize and implement?

  • 2

    @Gustavodossantossorati The normalization process always generates an inflation in the number of tables. And if your exercise asks you to normalize this, you will have to perform the process as many times as necessary to eliminate functional dependencies, and at each step you will have at least one more table. Thus, you can get used to it, see a large number of new tables emerge from the normalization is common.

2

Based on the book conversion examples, below is an example of 1FN to 2FN and 2FN to 3FN.

inserir a descrição da imagem aqui

Based on the drawing of the book, I made a drawing to try to help in your question.

inserir a descrição da imagem aqui

As your question was limited to converting only up to 2FN, I did not convert from 2FN to 3FN as shown in the book example.

Consider primary keys to words with full underscore , EX: (CPF) and the foreign keys as (_CEP) with the underline before the word.

I added some non-key attributes in the relationship table CLIENTE_AGENCIA only to make the illustration more coherent, since you did not cite any relevant attribute to compose this table.

Source: Database Systems 6th edition, authors: Elmasri, Ramez Navathe, Shamkant B. Year: 2011

Browser other questions tagged

You are not signed in. Login or sign up in order to post.