Search performance involving multiple disks

Asked

Viewed 96 times

4

Supposing that a query using JOIN between two or more tables, such as:

SELECT *
FROM foo INNER JOIN bar ON foo.id = bar.id

There is performance gain if the table data is in different HD’s (due to having more heads performing reading at the same time, I suppose) ? If there is, how do I separate my tables into different storage drives?

And in the case of "natural" implicit/Join junction? I.e.:

SELECT *
FROM foo, bar
WHERE foo.id = bar.id
  • Answering your question is something that depends on very detailed knowledge about how SQL Server works. If it were an opensource database, most likely someone would have already done a deeper analysis. I can state, however, that there is gain by separating the tables into different disks. Some even separate the indexes from the table itself. The process of separating the tables on different disks is a very quiet thing to do on SQL Server. We do this on a system with database >350 GB. The audit table is in a separate raid and only it is >200 GB.

2 answers

4


If done correctly, there will be a performance gain (maybe minimal depending on your application). Implementing a database on multiple Hds is a bit complex and perhaps unnecessary for the size of your project.

Trying to summarize: When you perform a join between two tables SQL Server will query much more than the contents of the table, for example the indexes to try to find the necessary information. If these indexes are not implemented correctly, you will lose considerable performance.

Simpler ways to improve performance:

  • Put database log/backup files on another hard drive.
  • Create indexes for your main queries.
  • Delete unused records (those from the previous year that are there only to harm your query), or put them in another database.

If still you are interested in partitioning: http://www.toadworld.com/platforms/sql-server/w/wiki/9650.partitioning.aspx

Some tips on how to improve performance: http://www.toadworld.com/platforms/sql-server/w/wiki/10373.configuring-database-files-for-optimal-perfomance.aspx

2

As Wédney said, the improvement would be minimal. In fact, I don’t even know if there will be a real gain, after all I don’t believe that SQL Server can read the two tables in parallel or asynchronously for the same query.

Considering the Implementation Plan

Before starting to read the disk, database systems such as SQL Server first calculate the execution plan. The first step is to check if there is any index with which it can determine in advance which records should be read from the physical table.

If Join is performed only by primary keys on foo and bar, then surely there will be indexes. Having enough memory helps SQL Server to keep indexes in memory and thus avoid disk index readings. Therefore, until this step, no reading would be necessary.

With the execution plan calculated, knowing exactly which records should be read from the disk, SQL Server will then start reading the disk records. If they are on separate disks, it may not make a difference, after all it probably needs to wait for a table or record to read to then access another table.

Moreover, even if the two tables are on different disks, nothing guarantees that the movements of the HD needle will be more efficient. The data of the same table can be fragmented in sectors scattered by the tracks of HD.

Completion

Therefore, it may be much more efficient to simply compress your database in order to eliminate unused portions of the file, and then defragment the hard drive,

Browser other questions tagged

You are not signed in. Login or sign up in order to post.