SQL in the code or in the database?

Asked

Viewed 1,556 times

21

I’m having a big doubt about where I should put mine query SQL, whether in code or bank.

Currently I have a Procedure which is mounted according to the parameters I have, ie, filtros.

Procedure

@escopoVerificacao varchar(MAX) = NULL
SET @sql = 'SELECT OrdemOrigem, OrdemTipo, IDCal, FROM  NormasCALClientes AS NCC' 
IF @escopoVerificacao IS NOT NULL BEGIN
    SET @sql = @sql + ' inner join VerificacoesClientes on VerificacoesClientes.IDCalCliente = NCC.IDCalCliente 
                                    AND VerificacoesClientes.IDCal = NCC.IDCal ' END exec(SQL)

In the C#.

  StringBuilder query = new StringBuilder();
query.Append("SELECT OrdemOrigem, OrdemTipo, IDCal, FROM  NormasCALClientes AS NCC");
    if (!String.IsNullOrEmpty(escopoVerificacao.Text))
        query.Append("inner join VerificacoesClientes on VerificacoesClientes.IDCalCliente = NCC.IDCalCliente 
AND VerificacoesClientes.IDCal = NCC.IDCal");`

Doubts:

  • It is good practice to traffic the text created in C# to the database as in the second example? I have heard that it would not be a good practice.
  • On issues of Maintainability and readability, which of the examples is best?
  • Example 1 could affect the bank’s performance by executing the ifs?
  • Which to use?

3 answers

21


Take a test with both of them and come to the conclusion on your own in the conditions of what you will use. The code has less influence than the database only you have access to. And keep in mind that in the future with a slightly different basis from the current one can change.

Maybe you’ll take a test and you won’t be able to tell the difference. Then you will know that performance does not matter and you should choose to do what is easier to maintain. Performance concerns rarely make sense if you’re not seeing a big problem. And if the problem exists, the way is to check on your own if another way significantly helps.

It is possible that the first code will run slightly faster for some optimization of SQL Server but I wouldn’t count on a big difference, at least not in most cases.

There are controversies of which is more readable and easy to maintain. I think in the application code is always more readable but there are people who say otherwise. Which do you think you will get better to tinker with in the future? Is there any reason to put in the database?

Even those who prefer to put in the database admit that it is not so simple to maintain code in the database. So it’s all the more reason to only choose this option if you actually get a considerable gain, which is rarely the case.

See more on Software Engineering and in the DBA.SE. Note that if you are a programmer you will probably want to put the logic in the application and if you are DBA will try at all cost to put the logic in the database :)

  • So the conclusion varies from user to user? In my opinion the code is also easier to maintain. Answering the questions the first I think would be in the code, and the second would have no reason to leave it in the bank.

  • It varies according to the database. What may be good today may not be tomorrow (of course that tomorrow is not real tomorrow). But it can remain the same as well. It depends on the data pattern contained in it. It is not random, depending on the pattern you will have different statistics and the optimizer can work differently. I usually don’t put it in the database until I prove I’ll have leverage meaningful. Putting the code on DB usually brings an advantage but in most cases it is so little that it is not worth the trouble of having to maintain it.

  • I’m finding it easier gives maintenance in Querys by code @bigown, more in relation to performance I’m feeling no difference.

  • In almost all cases the difference is almost zero even. It’s not zero in fact but zero is too absolute a number :) If you’re not noticing difference it’s what matters. Speed just to say you used the fastest is never an advantage. Speed because it makes the use experience better, can give a real noticeable advantage is advantage.

8

Completion:

Prefer writing queries on the application side instead of writing in stored procedures.

Why:

The following are some factors that lean towards writing queries on the application side instead of writing on the database side.

Performance

For the database, there is no difference between executing queries in stored procedures or received from the client.

You will only be able to observe a minimal difference in performance between one and the other due to variations in the environment that are very difficult to control, and now stored procedures will be slightly faster and sometimes queries sent by the client will be slightly faster.

Where stored procedures can gain advantage is in the heavy processing of large data volumes, for two reasons: why you are closer to the data and do not need to traffic itthem over the network and because you can easily count on the use of the disk using physical and temporary tables so as not to have a huge amount of RAM.

So, if you do not intend to do in stored procedures heavy processing of large data volume, you can rule out the factor performance of its decision-making.

See, for example, the results of this measurement.

Complexity

Each language introduced in your project adds complexity. It is common for a project to use several languages (SQL, C#, ASP and Javascript, for example), but complexity is something to fight against and not to search for free. Each complexity needs to have a good reason to be added. What’s a good reason to add the stored language to the project?

In addition to the complexity of an additional language, there is the complexity of additional tools, such as code debugging. Besides the tools for C# are much more evolved than the tools for Transact-SQL, for example.

Project structure, dependencies, tracking, versioning, deploy...

All of these needs are more complex when it comes to database code than when it comes to standard application code. Luckily, table definitions vary much less than everything else (business rules, queries, UI...). So that using database code for queries or business rules increases the challenges of code management, distribution and application update.

Scalability

When there is a need to process large volumes of data and opt for stored procedures, another major problem usually arises: scaling the solution.

The demand for a lot of performance is usually endless - the volume of data only grows and yesterday’s solution no longer serves today, and database servers are recognized for not allowing horizontal scaling. You put more and more processors on the server but find it costs a lot more than adding a new server - it turns out that not all database servers support parallelism between physical machines (or "active cluster")and the ones that support are very expensive solutions, much more expensive than just buying another single server or allocating another machine to the cloud.

I have seen projects migrating processing to the database (stored procedures) and a few years later migrating back to the application side, and the two movements had the same demand: more performance.

Portability

The SQL language is quite standardized among the databases, while the language for writing stored Procedure is quite specific in each of them. So that it is easier to support more than one database by writing only SQL instead of also writing stored procedures. For "easier" understand less costly and less risky, since each query only needs to be written once while the stored Procedure needs to be rewritten for each database.

Of course this factor is only important if you want to support more than one database server, which is a fairly common need.

  • 2

    Very good answer also @Caffé.

1

Depends on the System

The answers that have already been presented are very good, but I will contribute other aspects that may not exist in your project, but may be relevant to others.

It is good practice to traffic the text created in C# to the bank as in the 2nd example? I have heard that it would not be a good practice.

This type of recommendation not to perform outside the bank may be linked to the fact that traditionally there are professionals with different functions in the development of a system. The professional working in the database can write more optimized queries while the developer traditionally does not specialize in databases.

There are aspects that are linked to system requirements that can better define whether the choice is on the one hand or the other. For example, if the system needs to run in multiple databases, it is more costly to have to write queries within the database using the language specific to each.

On the other hand, the procedures the bank may have facilities that you will not find in the language you are using for access. If there is a requirement that you need to ensure the security of executing queries, it may be more interesting to use the object security infrastructure of the database itself.

Nowadays, with systems running in the cloud, it doesn’t matter much to the user which database is being used and a project can easily be written to run in a specific database and thus benefit from all its features. I see decisions like that with banks NoSQL, for example.

On issues of Maintainability and readability, which of the examples is best?

In the maintenance and readability part it will depend more on how you design the system than whether the query is on the program or bank side.

In your example you are concatenating the SQL both on one side and the other. A more readable alternative would be to have two queries, each with its purpose and the program to call one or the other according to the parameter instead of the parameter to modify the SQL. Another way to make this type of merge cleaner would be to use a template engine at the top of the command SQL in the same way as with HTML, for example.

Example 1 could affect the bank’s performance by the fact that run ifs? Which to use?

If you are really judicious, the example of the query written in the bank, you have two runs. The first to interpret the command and mount the SQL and the second that actually executes the SQL. The second call is internal and her time is irrelevant. For the bank, the SQL is dynamic and it will not have a chance to get the query ready. You will have to run the preparation every time the command is executed.

The query being sent from the program, although it represents only one execution, is also dynamic and the bank will not have the chance to leave it prepared. Of course the optimization mechanisms and cache of current banks already do a good job in this scenario, but it is good to know what is happening.

Usually the execution times of queries in bank SQL are of an order of magnitude much higher than the call times between the program and the bank and so you need to worry about the purpose of that query. If for example, you need to make a call from the program and with the answer you need to make other calls, maybe it’s more interesting that this is all in one procedure.

If on the other hand, you need to split the execution of the code to optimize the result, maybe leave everything within the procedure can limit you. This way this aspect should also be considered for each function you need to draw in the system.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.