Multiplexing of "foreign key"
Each form has its advantages and disadvantages, obviously. Decide which is the best I believe you should do in your application, provided you have knowledge basis to be able to make this choice decision. I’m presenting a recent case I passed at the company.
I work in a sales system, one of our features is that the seller has a current account. Thus, if the seller wants to give a discount more than the limit of a given product, he consumes from that current account and closes the sale.
A request like this may be pending, then enters the figure of the supervisor. The supervisor roughly approves or disapproves requests made by sellers. Also, if the supervisor wants to ease a seller’s life to close a last-minute purchase, he can make a transfer from his current account to the seller’s current account.
Both supervisor and seller are system users.
That said, the modeling is very similar image of Generalization/specialization that you put in. I have the table USUARIO
, CONTACORRENTE
(yes, it was all the original name of that table) with non-zero foreign key for USUARIO
, VENDEDOR
with non-zero foreign key to USUARIO
and SUPERVISOR
with non-zero foreign key to USUARIO
. SUPERVISOR
and VENDEDOR
also have their own codes that have no relation to the user code.
Well, everything was quiet until the following customization came:
The buyer (in our system called CLIENTE
) may also have a current account. In compensation cannot be required that this client has a user
My first reaction was to propose the creation of artificial users to keep this relationship intact. Can I just say she wasn’t accepted at all? It had to do with business requirements because we sold user licenses to use the system and this idea of mine would mess up this control.
Okay, next alternative? Multiplex foreign key.
Yes, bizarre, but practical. It solved our modeling problems. We proposed a migration of such luck that operations made in the old CONTACORRENTE
(now transformed into view) affect only the image migrated in the new table CONTA_CORRENTE
(yes, we are creative :+1:).
It is worth saying that we made a mistake in Trigger of migration that would do this magic? We forgot to test batch inserts, which generates crazy inconsistencies... but obviously we corrected this "tested" error in production
Multiplexing works as follows:
- I have a column that will be the "logical foreign key" of some table;
- I have another column that simply indicates which table the first column points to.
The idea came from the wavelength-based multiplexing of Mpλs, where through the intrinsic properties of fibre data transmission the system knows to which destination to direct.
The parallel between light waves and table lines then stood:
- both have multiplexer attributes (wavelength/multiplexer column);
- both carry data that differs from the differentiating attribute (amplitude and phase of the wave/the other columns of the tuple).
In general, the multiplexing I used to represent this relationship can be understood through the following scheme:
The triangle is not a real entity, it is only the "point" where multiplexing occurs.
In general, we work with databases of occasional integrity. It makes several operations easier on SQL-Server and we have the advantage that we may have temporary inconsistent states (but not as what the Sqlite allows), so we don’t worry about trying FOREIGN KEY
and CHECK CONSTRAINTS
literals in the bank. We leave it to those who will populate the data to do it in a coherent way.
That was a solution real to a problem that happened. I won’t say that every problem of this kind you must follow this path. I will also not say to always run away of such a solution. Knowledge is power, use it wisely and sparingly.
Multiple foreign keys (aka mode 1 of the question)
This type of connection can cause problems, especially if it is deliberately chosen in the creation of the project to avoid certain types of restrictions to avoid overheads (1 and 2).
I have a pesquisa
that a vendedor
will apply to clientes
which meet certain conditions. In this case, the conditions may be:
- customers buying certain representations (for example, those who buy Herbalife)
- customers belonging to the particular route of the seller
- all customers of that seller
- question applicable to sellers under the supervision of a particular supervisor
- all customers of all sellers without exception
This can be modeled as follows:
APLICAÇÃO_PESQUISA:
- CD_PESQUISA
- CD_REPRESENTAÇÃO
- CD_SUPERVISOR
- CD_VENDEDOR
- CD_SUPERVISOR
In order to speed up the last case, a flag additional, one identifier of all, thus the final structure:
APLICAÇÃO_PESQUISA:
- CD_PESQUISA
- CD_REPRESENTAÇÃO
- CD_SUPERVISOR
- CD_VENDEDOR
- CD_SUPERVISOR
- ID_TODOS
For the sake of avoiding overheads, has no restriction where only one of these values may be non-zero.
Well, due to this "loose constraint" very dependent on the register (and, even worse, if this information is imported from an external system, we don’t have too much control over it). So, what happens to who’s going to consume that information?
At the beginning of the conversation, these restrictions are not formally documented anywhere, so the poor soul who will take the scheme and try to extract some sense from it will take into account that any subset can happen. Except ID_TODOS
marked as truth and someone not null, this can not.
So it turns out that the poor soul who made this consultation, out of ignorance, had to take into account:
- if there is the representation of that seller and the customer at the same time, but the customer is not served by that seller in that representation, the search application is valid for that seller and that customer?
- and if I am branded a salesman and a supervisor who is not part of your supervisor chain or your supervisor’s supervisors, I should treat as a kind of
OU
or the data is invalid?
Multiple foreign keys, the most specific
There are cases where you really want multiple foreign keys to connect or not to a given structure, by design purposeful. I won’t get too much into the merits here of the structure and its foreign keys.
Imagine you have 7 foreign keys. CE0
, CE1
, CE2
, CE3
, CE4
, CE5
and CE6
. There are 27-1=127 different ways this marriage can occur. So, how to decide which forms are more specific?
We can say that each row of this table has an associated "bitage". We can represent as a 7-bit number as follows:
- if the column
CE_i
is not null, the position bit i
vale 1
- if the column
CE_i
for nulla, the position bit i
vale 0
So, the "artifical column" bitagem
may present the value 0110111
(little endian), what it means:
0
for the column CE0
, therefore null
1
for the column CE1
, therefore not null
1
for the column CE2
, therefore not null
0
for the column CE3
, therefore null
1
for the column CE4
, therefore not null
1
for the column CE5
, therefore not null
1
for the column CE6
, therefore not null
For a value of 64+32+16+4+2 = 118
About that bitagem
, we can define an auxiliary table that associates to bitagem
to a prioridade
arbitrary, and can then make any desired ordering as the "most specific".
Although this formulation recalls taxonomy 1, as it allows making multiple connections at the same time, it does not fit into this category.
The answer was very good, so from what I understand you made the first form I presented has some justification for not having made the second form ? which individual or collective decisions led to choose to do so in the first way. I have seen in various forums people doing so but do not understand why it is better or more advantageous to first form in relation to second form
– Jonathas B. C.
I didn’t do it the first way. Its first way requires two separate columns, and I made the same column the link. The decision not to use their second form was that they were immiscible entities
– Jefferson Quesado
As I tried to make clear in the reply (perhaps it was not), this solution was extremely unorthodox. It is not a standard to apply everywhere. Know your tools and for each case use the most suitable
– Jefferson Quesado