Is it really necessary to create an auxiliary third table in N-N relationships?

Viewed 3,018 times

11

In several places I have seen that when you have an N-N relationship (if I remember correctly) it is recommended to create an auxiliary third table, and that the relationship is then transformed into a 1 - 1 relationship (again, if memory serves).

  • Why do they say you need to do this? Is it about normalization?
  • Is it really necessary? What do you lose by doing this? And what do you gain?
  • Is this recommendation only for MySQL?
  • 2

    Customer - Order (Cliente - Pedido): I believe that is not N-N. Or will an order have more than one customer?

  • 1

    Your case of Cliente (customer) and Pedido (order) being N:N arises only in very specific cases where the system requires it. The tables Pedido and Produto (product), on the other hand, can have an N:N relationship, implemented by creating the table PedidoProdutos.

  • 1

    Without the table, how would you make that relationship?

  • 3

    Good examples are real ones; invented examples like this just get the answer that the model is wrong. One advantage of the relational model is being able to view the data in several different ways, simply and with good performance. This auxiliary table makes that possible. If you don't need to access the data that way, you don't need it.

2 answers

15


Why do they say you need to do this?

For the gains; see below.

Is it about normalization?

Yes.

Is it really necessary?

No, but almost. It is very difficult to get the model right without this association table, and in some cases doing without it is impractical, although possible.

What do you lose by doing this?

Mainly, you now need a JOIN.

And what do you gain?

Query flexibility, performance, ease of maintenance, and consistency, to mention only the main ones.

Is this recommendation only for MySQL?

No.

Details

A good example is suppliers x products. A supplier most likely provides several products, and it is very common for a product to be provided by multiple suppliers, mainly in wholesale and retail, but also in industry when the product is a commodity or has a perfect substitute.

How can you link the two? One way is to store, in the product row itself, all the suppliers that can provide it. It may sound odd, but it often works, because a product usually has only a few suppliers. At least in tables with variable-size rows this is not a big problem. Of course it has disadvantages. Access is not as simple: you may have to go through contortions to get what you want, such as finding out who supplies what, and you may run into performance problems, not least because a lot of data that is rarely needed sits together in the row, which burdens the cache and the reads that are far more frequent.
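To make the "contortions" concrete, here is a minimal sketch in Python with the standard `sqlite3` module. The table `product`, the column `supplier_ids`, and the sample rows are all invented for illustration; nothing here comes from a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: all supplier ids are packed into one text column.
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, supplier_ids TEXT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "rice", "1,12"), (2, "beans", "2"), (3, "flour", "12,21")],
)

# Naive search for supplier 2: a substring match also hits ids 12 and 21.
naive = conn.execute(
    "SELECT name FROM product WHERE supplier_ids LIKE '%2%' ORDER BY id"
).fetchall()

# The contortion: wrap both sides in commas so only whole ids match.
exact = conn.execute(
    "SELECT name FROM product "
    "WHERE ',' || supplier_ids || ',' LIKE '%,' || ? || ',%' ORDER BY id",
    ("2",),
).fetchall()

print(naive)
print(exact)
```

With these sample rows the naive query returns all three products, while the delimiter trick returns only beans. Either way the pattern starts with a wildcard, so an ordinary index on the column is unlikely to help.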

It gets much worse on the other side. Putting every product a supplier provides into the supplier row can become a monster, since it is common for a supplier to have thousands of products.

Another solution is to repeat records, that is, to have one row for each combination of product and supplier. This hurts normalization, duplicates data, and can be even worse in terms of performance, memory consumption, and the juggling needed to get where you want.
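A small sketch of why the duplication hurts, again in Python with `sqlite3`. The flat table `product_supplier_flat`, the `supplier_phone` column, and the data are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: one row per (product, supplier) pair, with the
# supplier's phone repeated on every row that mentions the supplier.
conn.execute(
    "CREATE TABLE product_supplier_flat "
    "(product TEXT, supplier TEXT, supplier_phone TEXT)"
)
conn.executemany(
    "INSERT INTO product_supplier_flat VALUES (?, ?, ?)",
    [
        ("rice", "Acme", "555-0100"),
        ("beans", "Acme", "555-0100"),
        ("beans", "Globex", "555-0199"),
    ],
)

# An update that reaches only one of Acme's rows leaves the copies in conflict.
conn.execute(
    "UPDATE product_supplier_flat SET supplier_phone = '555-0111' "
    "WHERE supplier = 'Acme' AND product = 'rice'"
)
phones = conn.execute(
    "SELECT DISTINCT supplier_phone FROM product_supplier_flat "
    "WHERE supplier = 'Acme' ORDER BY supplier_phone"
).fetchall()
print(phones)
```

After the partial update, Acme has two different phone numbers in the table, which is the kind of inconsistency normalization is meant to rule out.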

The solution that usually works best is an association (junction) table that basically contains just the supplier-product relation. It is small; with the right indexes it is fast to access; it does not hurt the cache, does not load more than needed, and does not duplicate data; querying it is not that hard; and in most cases it allows simple access from either side.
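The table described above could look roughly like this. It is a sketch in Python with `sqlite3`, not a definitive design: the names `supplier`, `product`, `supplier_product`, and `idx_product_supplier`, as well as the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Association table: one row per (supplier, product) pair, nothing else.
CREATE TABLE supplier_product (
    supplier_id INTEGER NOT NULL REFERENCES supplier(id),
    product_id  INTEGER NOT NULL REFERENCES product(id),
    PRIMARY KEY (supplier_id, product_id)  -- serves supplier -> products
);
-- Second index for the opposite direction, product -> suppliers.
CREATE INDEX idx_product_supplier ON supplier_product (product_id, supplier_id);

INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product  VALUES (1, 'rice'), (2, 'beans'), (3, 'flour');
INSERT INTO supplier_product VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

def products_of(supplier_name):
    """Names of the products a supplier provides, in alphabetical order."""
    rows = conn.execute(
        """SELECT p.name FROM product p
           JOIN supplier_product sp ON sp.product_id = p.id
           JOIN supplier s ON s.id = sp.supplier_id
           WHERE s.name = ? ORDER BY p.name""",
        (supplier_name,),
    )
    return [name for (name,) in rows]

def suppliers_of(product_name):
    """Names of the suppliers that provide a product, in alphabetical order."""
    rows = conn.execute(
        """SELECT s.name FROM supplier s
           JOIN supplier_product sp ON sp.supplier_id = s.id
           JOIN product p ON p.id = sp.product_id
           WHERE p.name = ? ORDER BY s.name""",
        (product_name,),
    )
    return [name for (name,) in rows]

print(products_of("Acme"))
print(suppliers_of("beans"))
```

With the sample rows, `products_of("Acme")` gives `['beans', 'rice']` and `suppliers_of("beans")` gives `['Acme', 'Globex']`. The same small table answers both questions with a pair of JOINs.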

Of course, you lose a little locality, most queries need a JOIN, and you have to make sure updates are applied correctly to this table as well. But that is not very different from the other solutions, and at least it is an operation within the normal relational patterns we need anyway.
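Much of that "make sure the update is done correctly" can be delegated to constraints. The sketch below uses Python with `sqlite3`. The schema and the helper `rejected` are hypothetical, and the `ON DELETE CASCADE` choice is an assumption, not something the answer prescribes. SQLite only enforces foreign keys when `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE supplier_product (
    supplier_id INTEGER NOT NULL REFERENCES supplier(id) ON DELETE CASCADE,
    product_id  INTEGER NOT NULL REFERENCES product(id)  ON DELETE CASCADE,
    PRIMARY KEY (supplier_id, product_id)
);
INSERT INTO supplier VALUES (1, 'Acme');
INSERT INTO product  VALUES (1, 'rice');
INSERT INTO supplier_product VALUES (1, 1);
""")

def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# The composite primary key refuses a second copy of the same link.
duplicate_link = rejected("INSERT INTO supplier_product VALUES (1, 1)")
# The foreign key refuses a link to a product that does not exist.
dangling_link = rejected("INSERT INTO supplier_product VALUES (1, 99)")

# Deleting the supplier removes its links through the cascade.
with conn:
    conn.execute("DELETE FROM supplier WHERE id = 1")
links_left = conn.execute("SELECT COUNT(*) FROM supplier_product").fetchone()[0]

print(duplicate_link, dangling_link, links_left)
```

Here both bad inserts are refused and no links remain after the supplier is deleted. Whether a cascade is appropriate depends on the application; restricting the delete instead is an equally valid choice.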

That goes for any relational database.

Think about how this works with objects in memory. The supplier will have a list of the products it provides, and the product will have a list of its suppliers. Are the lists part of the same object? Contrary to what many people think, they are not: the list is another object. Note that you will probably have two lists. You can do the same in the database, but I see no advantage, and it takes work to handle it that way. A single table with just two indexes is better.
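The in-memory analogy can be sketched in plain Python. The names and the data are invented: option 1 keeps two separate lists that must be kept in sync, while option 2 keeps a single collection of pairs, which is the in-memory counterpart of one association table.

```python
# Option 1: each side keeps its own list (two structures to keep in sync).
products_by_supplier = {
    "Acme": ["rice", "beans"],
    "Globex": ["beans", "flour"],
}
suppliers_by_product = {
    "rice": ["Acme"],
    "beans": ["Acme", "Globex"],
    "flour": ["Globex"],
}

# Option 2: one set of (supplier, product) pairs; both views derive from it.
links = {
    ("Acme", "rice"),
    ("Acme", "beans"),
    ("Globex", "beans"),
    ("Globex", "flour"),
}

def products_of(supplier):
    """Products linked to a supplier, read from the single set of pairs."""
    return sorted(p for s, p in links if s == supplier)

def suppliers_of(product):
    """Suppliers linked to a product, read from the same set of pairs."""
    return sorted(s for s, p in links if p == product)

# The two hand-maintained lists agree with the pairs only while every
# change is applied to both of them.
in_sync = all(
    sorted(products) == products_of(supplier)
    for supplier, products in products_by_supplier.items()
) and all(
    sorted(suppliers) == suppliers_of(product)
    for product, suppliers in suppliers_by_product.items()
)
print(products_of("Acme"), suppliers_of("beans"), in_sync)
```

A linear scan over the pairs is only for illustration. In the database, the two indexes on the association table play the role of fast lookups from each side.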

I’m somewhat critical of the use of non-relational databases, because the vast majority of the problems we solve with databases are relational. When people use non-relational technologies for them, they run into problems that do not exist with relational technology, and then they create yet more technologies and methodologies to patch the inadequate choice of technology.

It is off topic here, but non-relational models tend to be useful for one part of a problem, rarely for the whole of it, so relational databases that have evolved to work better with non-relational data become a perfect solution. Even problems that do have this characteristic are often bent to fit the model, often hurting the user experience, although that can simplify development a little.

So if that is your problem, you will almost always have to do it this way, unless the problem can safely be solved otherwise. Even if you do not need the association table today, maybe one day you will, and then you will have to make deep changes to the database. Few applications are prepared to work with different physical database structures, which is itself a conceptual error, though I am not sure how far pragmatism should prevail there.

10

First I will quote, based on books, some concepts about N x N cardinality, and then answer your questions:

According to the authors, when the relationship between two tables is N x N:

With N - N cardinality it is very difficult for the programmer to turn these semantics into code. Because of this, what is done is to create a new entity (an associative entity) that relates to the original entities with N - 1 cardinality.

When the cardinality is N - N a new entity shall be created to represent the relationship [...] between the entities involved in the relationship.

Initially, every N - N relationship has to be broken into N - 1 relationships, plus one more table. This is because multiple (N - N) relationships are impossible in a real database.

Considerations

At no point in the books cited is it actually explained why we should create another entity for N - N relationships.

However, I could see that all of them agree that treating the relationship as another entity reduces the complexity of implementing the logic, both when coding and when saving, maintaining, and extracting information in a relational DBMS.

Now about your questions:

Why do they say you need to do that? Is it about normalization? You don’t need to add another table for N - N relationships. In my opinion it is more a matter of handling them better, to reduce complexity when coding and when working with the data in relational DBMSs.

Is it really necessary? What do you lose by doing this? And what do you gain? When building the relational model for a project, my focus (or best practice, call it what you like) is to create a system that meets the need in the simplest way possible, so that maintenance is also as simple as possible.

As for what you lose or gain: based on the quotes, I believe it is productivity when coding. I may be wrong, but 1 - N or N - 1 relationships are less complex to handle. If I follow the "convention" of creating an entity for N - N relationships and that reduces my complexity, I gain productivity.

Is this recommendation only for MySQL? No.

According to Morelli, relational modeling (MER) was created to describe data stored in tables.

That is, this convention is for any relational DBMS.
