How to choose between NoSQL and SQL?

NoSQL databases are everywhere, and one question I always have when starting a project is what criteria to use when choosing between a relational database and a non-relational one.

How do I evaluate my project to know which option suits it best?

1 answer


There is no magic answer, and any answer will necessarily be generic. It comes down to gathering the requirements, understanding every candidate technology in depth (what it really solves and where it falls short), and projecting what you intend to build onto what you will use, to assess whether it meets your needs. In many cases you will have to build prototypes to "see" whether it does what you want. It takes experience not to get it wrong.

Anyone who does this will very likely decide to use SQL and not NoSQL. I am talking about statistics, not my preference.

Note that "NoSQL" is too general a term. There are several types, and each may be more suitable than the others for a given case. In general they are classified as:

  • key-value
  • column family
  • document
  • graph

and some others, possibly related to these, for example the object-oriented model, which ends up being a specialization of the graph model.

Characteristics of one and the other

Consistency and durability

The main characteristic of NoSQL databases is that consistency is usually left out. Can your application give up consistency? Facebook can. It doesn’t matter if not everyone sees a post at the same time, or if a like shows up late. But can decrementing stock or changing a bank balance work like that? With NoSQL, forget transactions and atomicity (usually). Even when some products say they can do it, there’s always a little "but" hidden somewhere.
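To make the bank balance case concrete, here is a minimal sketch of what atomicity buys you, using Python's standard `sqlite3` module. The `accounts` table, the `transfer` helper, and the amounts are hypothetical and exist only for illustration; the point is that either both updates are applied or neither is.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failing debit leaves partial work to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit
        # made above was rolled back together with the failed debit.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # enough funds: both rows change
failed = transfer(conn, 1, 2, 500)  # not enough funds: nothing changes
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

Without a transaction, the second call would leave the destination account credited and the source account untouched, and the application would have to detect and repair that by itself.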

In fact, the correct name for this type of technology should be NoConsistency, because that is what it really lacks. Increasingly these products are adopting relations and even SQL.

Many NoSQL databases don’t even have durability and need an auxiliary means to persist the data. Can you live with that? Is it suitable for what you need?

You can see that what you really have to choose is a product. NoSQL is too broad. Relational databases have the advantage of being more standardized, so they are easier to compare.

Structured data

Does your data lack a common structure? NoSQL usually works best when the data is relatively free-form, that is, each entry tends to have different "fields". Precisely because this matters in some specific scenarios, relational databases now offer an alternative for when part of the data is unstructured. This was the main selling point of NoSQL technology; it was so good that the feature has been incorporated into all of the most widely used RDBMSs, though in some not so well.
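As a rough illustration of mixing structured and free-form data in one relational table, the sketch below keeps the common fields as ordinary columns and stores the rest as JSON text. The `products` table and the helper names are hypothetical. Many RDBMSs offer native JSON column types and operators (PostgreSQL's `jsonb`, for example); to stay portable, this sketch only stores the JSON as text and decodes it in Python.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,               -- structured: every product has it
        price REAL NOT NULL,               -- structured: every product has it
        attrs TEXT NOT NULL DEFAULT '{}'   -- free-form: differs per product
    )
""")

def add_product(conn, name, price, **attrs):
    """Insert a product; any extra keyword arguments go into the JSON column."""
    with conn:
        conn.execute(
            "INSERT INTO products (name, price, attrs) VALUES (?, ?, ?)",
            (name, price, json.dumps(attrs)),
        )

def cheaper_than(conn, limit):
    """Filter on a structured column, then decode the free-form part."""
    rows = conn.execute(
        "SELECT name, attrs FROM products WHERE price < ? ORDER BY name",
        (limit,),
    )
    return [(name, json.loads(attrs)) for name, attrs in rows]

add_product(conn, "T-shirt", 19.9, size="M", color="blue")
add_product(conn, "Laptop", 999.0, ram_gb=16, ports=["usb-c", "hdmi"])

cheap = cheaper_than(conn, 100)
```

The structured columns keep their constraints and can be indexed and queried normally, while only the genuinely variable part gives up the schema.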

Is your application (or applications) capable of handling unstructured, non-standardized data well? It is not the same as working with relational data. In general it makes reading easier but can create huge problems when writing data. Besides the database’s consistency being only eventual, it is harder for the programmer to keep everything consistent. Unstructured data is difficult to automate.

In many cases NoSQL is just a data repository that leaves it to the programmer to deal with what is there. This can be good or bad. SQL is usually a complete data manipulation tool.

Data relationship

I see a lot of people underestimating the need for relationships. They start building something with NoSQL, then the requirements grow and they have to make adaptations that drift away from the adopted model, creating difficulties and hurting performance, and what was good in the beginning becomes a nightmare. I’ve seen a case where at some point the model had changed so much that it ended up relational, on a tool that is terrible at handling relational data.

Will you need some reports or structured queries? The relational model provides better tools for this. There are certain data access patterns that make NoSQL crawl. Imagine the difference between 30 lookups per row of data to build a report and a single lookup. Multiply that by millions of rows. NoSQL may require one lookup per field, while SQL requires only one lookup per row.
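A small sketch of the reporting argument, again with `sqlite3`. The `customers` and `orders` tables and their contents are made up for illustration. The SQL version states the report declaratively in one query. The `manual_report` function is a hypothetical stand-in for a store that only hands documents back and leaves aggregation to the application; some NoSQL products do provide their own aggregation features, so treat it as the worst case rather than a description of any particular product.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       INTEGER NOT NULL  -- amount in cents
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Bruno');
    INSERT INTO orders VALUES (1, 1, 1000), (2, 1, 2500), (3, 2, 700);
""")

# One declarative query: the database plans the join and the aggregation.
report = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.total) AS revenue
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY revenue DESC
""").fetchall()

def manual_report(docs):
    """Same report computed in application code over denormalized documents."""
    acc = defaultdict(lambda: [0, 0])  # name -> [order count, revenue]
    for doc in docs:
        acc[doc["customer"]][0] += 1
        acc[doc["customer"]][1] += doc["total"]
    rows = [(name, count, revenue) for name, (count, revenue) in acc.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

order_docs = [
    {"customer": "Ana", "total": 1000},
    {"customer": "Ana", "total": 2500},
    {"customer": "Bruno", "total": 700},
]
manual = manual_report(order_docs)
```

Both produce the same rows here, but every new report means new application code in the second case, and all the documents have to travel to the application first.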

Performance

There is a lot of talk about NoSQL performing well, which is not quite true; after all, there are no miracles. Today relational databases have good optimization strategies. NoSQL generally achieves its performance with a lot of memory, and that helps relational databases a lot too.

Some databases, such as MongoDB, let you choose whether you want more performance or other features, such as durability. But then the performance is similar to what you get with SQL or other more structured kinds of database. With little memory, forget NoSQL for almost all scenarios. Many of them don’t even work if there isn’t enough memory for the entire database.

I see a lot of people saying that it avoids JOINs. So does the relational model, if you model it that way. People don’t model that way because there are drawbacks to doing so, and those drawbacks do not disappear in NoSQL. Of course, in the past SQL-based databases had fewer tools to support a model without JOINs, but that has changed. NoSQL, at least in certain models, is good when the data usually lives in a single document and is always accessed with the document as the entry point, which is rather inflexible; otherwise you end up making so many accesses to avoid the JOIN that the JOIN would have been preferable.

Auxiliary tool

It is common for NoSQL to be used as an auxiliary tool in most applications. Often it is just a cache. Do you have enough volume to need that cache?

This site you’re using now uses Redis. Are you building a website that will have a similar volume of traffic, where a large share of requests will reuse the same data for a reasonable period? Do you need that level of scale? Do you need to distribute data across multiple servers to reach your scale goals? NoSQL tends to be good at this. I’ll let you in on a secret: they used to use much more caching than they do today; they found that the cache got in the way more than it helped.
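The cache role described above usually follows the cache-aside pattern: look in the cache first, fall back to the primary database on a miss, and keep the result for a limited time. The sketch below is a minimal in-process version of that idea; the `TTLCache` class, the `load_question` loader, and the 60-second lifetime are all hypothetical, and the plain dictionary merely stands in for an external store such as Redis.

```python
import time

class TTLCache:
    """Cache-aside helper: entries expire `ttl_seconds` after being loaded."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be simulated
        self._store = {}      # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                  # fresh entry: skip the database
        value = loader(key)                # miss or expired: hit the database
        self._store[key] = (now + self._ttl, value)
        return value

calls = []  # records every trip to the (pretend) primary database

def load_question(question_id):
    calls.append(question_id)
    return {"id": question_id, "title": f"Question {question_id}"}

fake_now = [0.0]  # simulated clock, in seconds
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.get_or_load(42, load_question)   # miss: loads from the database
second = cache.get_or_load(42, load_question)  # hit: served from the cache
fake_now[0] = 61.0                             # the entry has now expired
third = cache.get_or_load(42, load_question)   # miss again: reloads
```

Note what the sketch leaves out: nothing invalidates an entry when the underlying row changes, so readers can see stale data for up to the whole lifetime. That kind of problem is one way a cache can get in the way more than it helps.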

This is the main advantage nowadays: scaling the solution to internet levels (even when it is not an internet application). And note that being on the internet means nothing by itself: a simple site with limited traffic, for a real estate agency for example, does not need this.

More detailed information

In fact, much of this has already been answered in "What is a NoSQL database? How does it work?" and "Is NoSQL as problematic as it seems?".

There is also a question with an example of appropriate (or inappropriate) use.

Read one of the best articles on the subject. Here is a summary of the NoSQL usage scenarios it lists:

  • handling streams of logs or other continuously updated data;
  • offline data synchronization;
  • low-latency needs, such as games;
  • online games, especially massive ones, which can benefit more from being schemaless;
  • voting systems (likes) and hit counters;
  • priority queues;
  • session data;
  • data analysis in specific scenarios (big data);
  • high-traffic websites with highly repetitive content;
  • cases that need graph relationships, such as recommendation applications;
  • some real-time financial market applications.

I purposely left out whatever is more abstract, has already been covered above, or can be done just as well with modern SQL (an example is storing descriptive data for very diverse products). Above all, I left out some things that are biased or false (at least now; they may have been true in 2009).

MongoDB

MongoDB, which was specifically mentioned in the question, has a page with usage recommendations. To tell the truth, the page doesn’t help much; it seems they realized it is better not to give details, because if they went deep, 99% of people would give up on it. Not that the product is bad; it is just not necessary in most scenarios. It certainly has its uses, but I’ve seen plenty of unsuspecting people go through horrific experiences with the product (that was before 4.0, but most of the problems remain). And the fault lies with whoever chose the wrong technology; the product is good for its intended scenarios.

Personal experience

I see a huge number of articles claiming that NoSQL solves certain problems better. Some are true, but in most cases either the person does not know how to use SQL (really, the relational model) correctly, often for lack of knowledge of what databases offer, especially the most modern ones, or they are acting in bad faith to sell their favorite technology.

I will repeat what I said in another answer: what is said about NoSQL is out of proportion to the real need. Like much of what we see on the internet and in the media in general.

As seen on the internet

There’s a saying we use a lot in computing that is very true, unlike some that don’t hold up at all: the best tool is the one you know. If you’re going to use another, master it first. Don’t use it because it’s fashionable. If you have mastered the current one and it solves your problems, keep it. If you do things right on one and wrong on the other, the one you do right will work better, in most cases.

Personally I like to leave SQL aside. I have been working on something that will possibly become my preferred database, and it is a NoSQL one. It will be flexible about the model, handling a little of everything, but will prioritize the relational model. Performance is promising in all cases and it requires little scale-up. The one area where I know it will be merely adequate is scalability, and it won’t be so easy to scale out.

It is very common for NoSQL to give an initial illusion of being easier to work with. That is because it is simpler and less powerful; sooner or later the bill comes due. If you don’t have a solid foundation you may never realize the hole you’ve gotten into.

  • Very interesting answer; it made me reconsider several points I had not yet taken into account. Regarding data structure, one could say that Mongoose meets this need well in MongoDB, but it is still an ODM, so it cannot be considered a feature of the database itself.

  • +1 I asked a question about this recently, but this answer helped me too. I’m trying to understand a little more about MongoDB and Amazon’s DynamoDB. The latter seemed much easier to me and also cheap compared to MongoDB, besides having in theory more features, but it is not possible to know how useful those features are.

  • Clearly biased post in favor of SQL. As for the argument that NoSQL databases give the illusion of being better at the beginning, that applies to both NoSQL and SQL. All empty databases are good.

  • @Alves that is an opinion. Several very experienced people read the same text and saw it differently. It would be worth checking whether you are the one reading it with a bias. Your comment makes it seem that the text only speaks ill of NoSQL, which is not true, as anyone can see. The text shows that SQL tends to be a better solution for most scenarios, but not all, and cites where NoSQL can be put to good use. For years several people here have agreed with this. If you have grounds, you can show the opposite in an answer. And please don’t change my words just to have something to argue against; that is biased and dishonest.
