Depends on scale or volume
It all depends on the scale of the product. When the scale increases, many design decisions are counter-intuitive.
By scale I don’t just mean the volume of data. And scale and volume depend on context.
From a performance standpoint, a reasonable volume would be perhaps in the millions of records. I have worked on systems that processed hundreds of thousands of records without any difficulty, obviously with the proper optimizations. If each customer has their millions of records, then putting everything on the same table can be a problem.
However, from a business point of view, it depends on who the customers will be, how much the product intends to be evolved, the procedures that will be adopted for development and maintenance, having the data in a database can be totally desired or totally impossible.
The plan is to sell the system to a few companies and then the demand of each company for maintenance and evolution is expected to be great? Or it would be selling the system as a Saas to a mass of small businesses without them having more advanced access (think of the Free Market, for example)?
Problems with a shared database
Let’s think of some scenarios with a single bank:
- You need to backup, all customers get slow because of this
- A client unintentionally deletes data and requests the restoration of a backup (I didn’t want to be in the shoes of the person responsible for it)
- Application errors cause one client’s data to appear to another, or worse, data is updated without a proper filter on
WHERE
.
- You need to update the application and the new version changes the data structure. You necessarily need to update the system for all customers, all or nothing. It is common practice for serious companies to release first updates to a subset of customers and then, if all goes well, update others. That would be impossible.
- Your company sells the system to different people around the world. From a moment on, it is decided that in order to improve performance in other countries or to adapt the current legislation, the system has to host the data in that country. If the database were separated, it would be enough to move the database to a virtual machine in a data center of that country and also to place an application instance there. But now you will have to extract data from a shared database and put into another void.
- If there’s too much use for one customer, it can get in the way of the others. For example, a large client runs a routine that imports a million records. Other customers who want to read/write on the same table will be affected.
- Generating ids and keys will be more complex and/or less intuitive, since two records with consecutive ids may belong to completely different clients.
- All system queries and all database access methods will need an extra parameter. The domain objects and virtually the entire system would be polluted by tenant id back and forth.
Advantages of separate databases
- Doing and restoring backups is trivial
- Support does not go crazy to investigate a data loss/corruption problem
- Different clients may be in different versions of the system
- System instances can be spread around the world and moved as needed
- Heavy usage, deadlocks, and other bank-related issues that occur in one customer do not affect the others
Difficulties in maintaining several banks
The big problem of a bank for each client is to maintain the strict organization of the changes in the database properly related to each version of the system.
Languages like PHP where it’s easy you simply overwrite a file in production should be extra careful.
First, use a migration library so your system has the ability to always update the database automatically and ensure that the structure is consistent with that version. This may give a little more work at first, but it pays (very well, by the way) over time, especially in projects that evolve constantly.
Second, always have strong control of system versions related to your code repository. For example, each version of the system that goes into production must have a corresponding tag in Git. So when any problem occurs you can easily join the database and that specific version and play any errors.
Give more details. Example, if each company is an independent store, I imagine that each store will have different products. These products are accessible by a single shopping mall or are accessible independently?
– Daniel Omine
Yes, each customer would be an independent store, with different products, without the possibility of relationship with each other, all independently. The intention is to have several small customers, preferably if all goes well, thousands of customers.
– JamesOrtiz