Best way to keep data that depends on a condition

Viewed 241 times

8

Suppose I have a table publicação that has, by default, the attributes autores (derived from a relationship table), titulo, edição, editora and ano. However, depending on the type of publication (livro, artigo em periódico, artigo em jornal and others), additional data would need to be collected.

For example, a publicação of type livro would also need to store quantidade de páginas and volume. However, a publicação may also be generic and have no type.

My question is: what is the best way to deal with this situation?

I thought of two possible ways:

    • Create a table publicação with the standard attributes.
    • Add to publicação a column tipo
    • Create a table for each possible publication type with its particular attributes
    • Relate these new tables to publicação and form a composite key

or

    • Create a table publicação with every possible attribute of every type, and handle the differences in the server-side application. (This does not seem like a good solution to me.)

I have little experience with databases. I believe there are other solutions; which ones are they?

3 answers

7


There are no absolutely right solutions in software development, so the best way depends on a lot of things.

What can be said is that the first form is considered the most correct, because it involves normalizing the data. Most correct does not mean best, though. There are situations where you should do what is less correct in order to get the best result for the specific case.

In general, the less "optional" information a table has, the better, but there can always be reasons to allow it, possibly even as an optimization.

Even defining what is best is already complicated. Best at what? What for? For whom? When? Even if a solution meets one criterion, it will not be able to meet all the others.

Alternatives exist. For example, you can go deeper into normalization, up to sixth normal form (6NF), or store the data as key-value pairs. I don't recommend it, but it is a way to decouple the data.
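To make the key-value alternative concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table publicacao_atributo and the columns nome and valor are hypothetical names chosen for illustration; they do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publicacao (
    id     INTEGER PRIMARY KEY,
    titulo TEXT NOT NULL
);
-- Hypothetical key-value table: one row per extra attribute of a publication.
CREATE TABLE publicacao_atributo (
    publicacao_id INTEGER NOT NULL REFERENCES publicacao(id),
    nome          TEXT NOT NULL,
    valor         TEXT,
    PRIMARY KEY (publicacao_id, nome)
);
""")

conn.execute("INSERT INTO publicacao VALUES (1, 'Example book')")
conn.executemany(
    "INSERT INTO publicacao_atributo VALUES (?, ?, ?)",
    [(1, "paginas", "320"), (1, "volume", "2")],
)

# Collect the extra attributes of publication 1 into a dict.
extras = dict(
    conn.execute(
        "SELECT nome, valor FROM publicacao_atributo WHERE publicacao_id = ?",
        (1,),
    ).fetchall()
)
```

In this sketch every value ends up stored as text and the database cannot check which attributes belong to which type, which is part of the price of decoupling the data this way.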

The second form should only be adopted if you have real performance problems (measured correctly), which I doubt is the case.

My only doubt is about using the composite key. Unless there is something you did not mention, I don't think you need one. The id of the publication can be used as the primary key in the table of each specific type.

If the publication is generic, simply leave the column that indicates the type without a value, probably NULL.
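A minimal sketch of this layout, using Python's built-in sqlite3 module. The ASCII table and column names (publicacao, livro, paginas and so on) and the sample rows are assumptions made for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.executescript("""
CREATE TABLE publicacao (
    id      INTEGER PRIMARY KEY,
    titulo  TEXT NOT NULL,
    edicao  TEXT,
    editora TEXT,
    ano     INTEGER,
    tipo    TEXT               -- NULL means a generic publication
);
-- The publication id is both primary key and foreign key: no composite key needed.
CREATE TABLE livro (
    publicacao_id INTEGER PRIMARY KEY REFERENCES publicacao(id),
    paginas       INTEGER,
    volume        INTEGER
);
""")

# One book with its extra attributes, and one generic publication without a type.
conn.execute("INSERT INTO publicacao (id, titulo, ano, tipo) VALUES (1, 'Example book', 2010, 'livro')")
conn.execute("INSERT INTO livro VALUES (1, 256, 1)")
conn.execute("INSERT INTO publicacao (id, titulo, ano) VALUES (2, 'Generic report', 2015)")

rows = conn.execute("""
    SELECT p.id, p.tipo, l.paginas
    FROM publicacao p
    LEFT JOIN livro l ON l.publicacao_id = p.id
    ORDER BY p.id
""").fetchall()
```

Because publicacao_id is the primary key of livro, each publication can have at most one row of book data, and the foreign key rejects book data for a publication that does not exist.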

  • Thanks for the remark. Indeed, using the publication id as the primary key in the type tables seems like a good alternative for this situation!

2

If you go with the second alternative, the publication table will not scale (that is, with too many records, retrieving specific information will perform badly). On top of that, you have this "information treatment" in the application, which can eventually become a headache as well. No, it's not a good solution.

The first alternative is the best, and you are actually describing a common step in building databases, the Second Normal Form (2NF). Data normalization is a set of procedures followed to achieve consistent storage and efficient access to information. There is plenty of material online, starting with Wikipedia, with everything well explained and with examples. It is worth understanding these principles first and then building your data model.

1

I believe the first form is the best, also considering scalability and future features you may need.

It also does not "bloat" the table, which helps performance.

For cases where you need everything together, you can create a view in the database that returns the combined information.
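As a rough sketch of such a view, again with Python's built-in sqlite3 module: the view name publicacao_completa, the artigo table and its periodico column are hypothetical, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publicacao (
    id     INTEGER PRIMARY KEY,
    titulo TEXT NOT NULL,
    tipo   TEXT
);
CREATE TABLE livro (
    publicacao_id INTEGER PRIMARY KEY REFERENCES publicacao(id),
    paginas       INTEGER,
    volume        INTEGER
);
CREATE TABLE artigo (
    publicacao_id INTEGER PRIMARY KEY REFERENCES publicacao(id),
    periodico     TEXT
);

-- Hypothetical view: one row per publication, with type-specific columns
-- left NULL when they do not apply.
CREATE VIEW publicacao_completa AS
SELECT p.id, p.titulo, p.tipo, l.paginas, l.volume, a.periodico
FROM publicacao p
LEFT JOIN livro  l ON l.publicacao_id = p.id
LEFT JOIN artigo a ON a.publicacao_id = p.id;

INSERT INTO publicacao VALUES
    (1, 'Example book', 'livro'),
    (2, 'Example article', 'artigo'),
    (3, 'Generic item', NULL);
INSERT INTO livro  VALUES (1, 320, 2);
INSERT INTO artigo VALUES (2, 'Example Journal');
""")

rows = conn.execute("SELECT * FROM publicacao_completa ORDER BY id").fetchall()
```

Supporting a new type in this sketch means creating one more type table and adding one more LEFT JOIN to the view.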

The first form also corresponds to standard database normalization, the Second Normal Form, as Carnation already mentioned.

Later, when you need to add a type not foreseen today (an online publication or a Facebook post, for example), this will be much easier to do: just add one more type and adjust the view.

I mean, the first option will make your life a lot easier in the future.

  • That's it! Another question: can I use a prefix to visually identify the type tables, e.g. pub_livro, pub_artigo, etc.? Is there a convention for this?

  • You can, no problem. The table prefix is usually the developer's own choice, but the convention is to use 2 or 3 letters followed by an underscore, exactly as you wrote: "pub_<table name>".
