4
I am modeling a Mysql database where the main entities are documents. Among several other fields, these documents relate to one or more phases of a particular project.
There are 4 basic values (let’s call them A, B, C and D), and the property value phase is a set of 1 to 4 of them. For example, a document may be related only to phase A, phase A and B, phase B, C and D, etc.
I see several ways to model this, but I’m in doubt about which one would be more appropriate. The options I’ve considered:
- 4 columns "boolean" (
TINYINT(1)
), one for each phase. - A single binary column with 4 bits (
BIT(4)
). - A column of the type set:
SET('A','B','C','D')
. - Relationship table.
I will need to get/display both the list of phases of a given document and the list of documents linked to a given phase.
What would be the most appropriate structure in this case? What criteria should I take into account to make this decision?
Thinking here: if you do not make a point of Foreign Keys, the table
conjunto
is unnecessary. I just don’t know if it’s a good idea, after all there are only 16 lines... Anyway, you can save a Join, comparingfc.id_conj
directly withd.fases
.– mgibsonbr
Interesting idea, but I don’t really have as many documents to justify the complexity. I really doubt that performance is a problem with any of the alternatives I put in the question. That’s why I was left with no parameter to choose one of them. I liked the pro-relationship table argument, to preserve normalization. The other options seem to me more economical (do not require additional table), but apart from SET (which I think only exists in Mysql), they are ugly and not very self-explanatory.
– bfavaretto
Yeah, my main parameter here wasn’t even efficiency, but simplifying darlings. My experience with comics in general is limited, and I imagine that the other options would complicate the queries a little. But I may be wrong. Let’s see if anyone with more dominance of that area responds. P.S. Migrating from a binary column (option 2) to a scheme like this would not be difficult, and vice versa, so that there is a certain flexibility if it is discovered in the future not to have made the right decision.
– mgibsonbr
I think deep down I want to know which option is more "elegant". Does my question is based on opinion?
– bfavaretto
I don’t know... for me, "elegant" in the context of a relational comic will always mean "normalized", so I would reject the binary column and maybe what it uses
set
(I have no experience, but a columnset
would not violate the first normal form? ). As for the others, my preference for the relationship table is because it "homogenizes" the darlings - while the boolean columns would require a different query for phase A, another for phase B, etc (i.e. each complex query involving a phase would have to be adjusted for each boolean column, you cannot use a single parameterized query).– mgibsonbr
I would need to study further, but from my quick research there are controversies as to whether or not sets violate 1NF. Anyway, I think you’re right, in terms of elegance the relationship table is standard, and it’s always a safe option.
– bfavaretto