Data Modeling: Integrity vs. Performance


4

In the company I work for, there's a data architecture I've never seen before, and I'd like to know whether this is common or a new market trend. Just to mention, the database is Oracle. Here are some points:

  • There is almost no relationship between the tables. For example, one of the tables has a field that holds the IDs of another table, but this field is not a foreign key, just a plain integer column, so any value can be saved there. The responsibility for ensuring data integrity lies with the API that accesses this database. So when a POST arrives wanting to save into this field, which in my view should be a foreign key, the API first runs a SELECT against the other table to check whether the value exists, and only then saves it (there is a sketch of the difference right after this list).

  • There are several databases, i.e., one database per business subject, even though tables in these different databases could very well be related.

  • There is plenty of redundancy. On this point I can even understand, because there is the argument of denormalization for performance, but this case, at least in my view, borders on the extreme.
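To make the first point concrete, here is a minimal sketch of the two approaches in Oracle SQL; the table and column names (ORDERS, CUSTOMERS, CUSTOMER_ID) are hypothetical, just to illustrate the idea.

    -- As described above: the "reference" is a plain numeric column,
    -- so the database accepts any value in it.
    CREATE TABLE orders (
        id          NUMBER PRIMARY KEY,
        customer_id NUMBER   -- holds ids from CUSTOMERS, but nothing enforces that
    );

    -- With a declared foreign key, the DBMS itself rejects orphan values:
    ALTER TABLE orders
        ADD CONSTRAINT fk_orders_customer
        FOREIGN KEY (customer_id) REFERENCES customers (id);

    -- Oracle does not index FK columns automatically, so an index is usually added
    -- as well, both for joins and to avoid locking issues on the parent table.
    CREATE INDEX ix_orders_customer ON orders (customer_id);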

I questioned the person who did the modeling, and the main argument was performance. So all responsibility for the data, its integrity, and the business rules sits in the API. I have only just entered the job market, and I would like to know whether this is a new market trend.

  • 1

    I've always been very concerned about performance, and I understand that if I do good data modeling, use resources properly, and use relationships between entities, I will always get good performance. To be honest, I can't understand how any redundancy could improve performance, much less how dropping foreign keys and the indexes that go with them, which help a lot in queries, could. On second thought, I can even understand the data redundancy: since there is no proper modeling, whoever designed this structure used redundancy to improve performance.

  • However, if a suitable model had been adopted, redundancy would become unnecessary.

  • @Rodrigok.B The fact that you don't understand it doesn't mean it has no advantages; that it does is actually well known.

  • 1

    @Maniero OK, maybe I expressed myself badly, but I have seen this kind of approach before: programmers using redundancy to reduce the joins in their queries. What I mean is that if the data modeling is done well, it becomes unnecessary. Thinking about performance, I find good modeling more efficient than redundancy. This kind of approach can even help with queries, but it can cause problems in maintaining the data, for example. So I wonder how efficient it really is. Hence what I said: "I can't understand how it improves performance".

  • We are talking about speed here, not efficiency. I agree with the other points.

  • @Maniero thank you for your comments.

  • "responsibility to ensure data integrity is in the API" This is very wrong. DBMS does this and much more efficiently and safely. It has seen a lot of this type of approach and usually leads to a lot of maintenance and problems that could be avoided with the proper Fks and indexes


2 answers

5

If I understand correctly, that's sixth normal form, or EAV (Entity-Attribute-Value), and it is used in some cases. It is being used more and more, and in some cases abused.
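For reference, an EAV-style table usually looks something like the sketch below (hypothetical names): instead of one typed column per attribute, there is one generic row per attribute, which is exactly why the schema itself can no longer enforce types or references.

    CREATE TABLE entity_attributes (
        entity_id  NUMBER,
        attr_name  VARCHAR2(100),
        attr_value VARCHAR2(4000),  -- everything as text; neither types nor references are enforced
        CONSTRAINT pk_entity_attributes PRIMARY KEY (entity_id, attr_name)
    );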

And it seems they have given up on ACID transactions.

It seems wrong to do this on Oracle. If you are going to do it that way, one of the so-called NoSQL databases would probably be more appropriate. But only by seeing the real situation could anyone say for sure.

Redundancy by itself is not a problem, though it can be, depending on the case. There are cases where it is even mandatory.

In general, what you have described should bring performance problems. Ask him to demonstrate clearly and objectively that performance is better. I doubt it, except for the redundancy in some cases; if everything rests on that, it might be. It depends on the read and write load, and on the write patterns.

Many people exaggerate in their use of databases; you don't always need everything the database offers. But yours seems to be an exaggerated case in the opposite direction. I would go so far as to say the person followed a cake recipe while wanting to make a pie. I could be wrong, though: the description of the problem here may be off, or the context may actually demand it. Anyway, this is just speculation.

In fact it does seem to be a trend, and a bad one at that. My perception is that this is necessary in only a tiny number of cases. Pragmatists do what they actually need to do, and almost every problem fits beautifully into the relational model, without exaggeration.

Putting business rules in the application (the API is something else; it has no implementation) is quite common and desirable in the vast majority of scenarios. In some cases, though, it is very expensive to put the business rules in the application.

  • Thank you very much. Regarding data integrity (relationships between tables), does it hurt performance?

  • 1

    It depends on how it's done, but it usually hurts. Doing it in the application won't help, though; it actually tends to hurt more. Unless you give up integrity altogether.

  • What about creating several databases, even when there are tables that could be related? When I questioned this, I was told that if there was a problem in one of the databases, it would not bring down queries against the others. Is this argument correct? Is it considered good practice?

  • 1

    That is the idea of microservices. If one of them stops, it can compromise consistency as a whole. Can you give up consistency? It seems clear to me that they have done something without knowing the consequences, unless there is a plausible justification, or we are missing information about the real situation.

2


From what you report (one database per business subject, lots of denormalized data, and rules in the web APIs), the company you work for is probably using, or at least trying to use, a microservices architecture.

Microservices need one database per service because they are independent. The secret of the performance in this case is not related to having (or not having) foreign keys, but to being able to break your system across several different servers, spreading out all the processing.

Regarding your question about this being a trend, I believe microservices are more of an option than a trend, since a microservices architecture is only justified when there is a lot, really a lot, of simultaneous demand (for example, an Amazon or a Mercado Livre), because such an architecture becomes very complex and expensive.

For example, Stack Overflow, which receives millions of requests, has a simple architecture, with no DDD and no microservices, and it still performs extremely well (according to a talk I attended by one of the platform's developers). So, since you are just starting out in the field, be very careful with fads.

Here is a very good video to reflect on regarding microservices:

https://www.youtube.com/watch?v=ValESAojRSw
