How to apply Dexes to improve the performance of the queries?

Asked

Viewed 1,791 times

15

I’m looking to increase the performance of my database, which has over 1 million records.

Research I have undertaken

Doubt

The idea of cluster it seemed very tempting to me, but it seems a somewhat drastic measure, and unnecessary. So I would prefer to remain with index, only I have one question:

  • It makes sense to have more than one column in one index? The other columns wouldn’t have to go through all the records anyway because it wasn’t the main one?

Example:

CREATE TABLE tb_item(
    cd_item integer NOT NULL DEFAULT nextval('tb_item_cd_item_seq'::regclass),
    ds_item character varying(30),
    CONSTRAINT pktb_item PRIMARY KEY (cd_item)
);

CREATE INDEX "tb_item_cd_item_Idx" ON tb_item (cd_item, ds_item);

SELECT
    cd_item
FROM
    tb_item
WHERE
    ds_item ILIKE 'B%';

Even though I’ve been searching for ds_item ILIKE 'B%'; in my view it would have to go through the entire table, since the item of cd_item = 570000 may have ds_item = 'bola'.

In this case it would be better to do so?:

CREATE INDEX "tb_item_ds_item_Idx" ON tb_item (ds_item);

In short my question is:

It’s more enjoyable each time SELECT create a INDEX with the main column, or have a INDEX with all main columns?

  • Now that I saw that a reward was placed. What happened to her? Even if it was not given manually, I think I would receive 50%.

  • @bigown Here it does not appear, but I had put as "one or more answers...", and I thought it would enter automatic, because it did not put manually. I think it has to do with Gabi.

  • I asked at the finish line: http://meta.pt.stackoverflow.com/q/4794/101

  • William, rewards must be assigned manually. Automatic assignment exists only in certain cases, given certain conditions.

1 answer

16


Index clustered

I don’t know if you understand the term clustered index, I imagine you’re thinking of indexes spread by machines on nodes of cluster, that’s not it.

This is a term used to indicate that the index is the data table itself. That is, the table data is recorded in the order of this index, so the table itself is used as the index. This has advantages and disadvantages, but in general the advantages outweigh the disadvantages. Obviously the indexes do not clustered are those who need an auxiliary structure to store in the desired order. Using logic it becomes clear that a table can only have one index clustered.

To learn more you can read the wikipedia article or the SQL Server documentation which implements this technique more explicitly.

So there’s no exaggeration in that, on the contrary, this practice is the most common when the database offers it.

Multicolumn index

It is extremely common to have multiple columns in an index, whenever there is some reason to access the data in the order composed of more than one column, this should be done.

Remember that you don’t need to use all the columns of each index in your searches. Of course, the index can only be used effectively if the entire search contains the first columns. Then you search only for the first column uses the index. Searching only for the second column has no way to use it. Then the solution is to have another index with the second column of this as the first of another index.

Indexes to use with LIKE may work well when you have a text followed by %, but if you have a text after the symbol %, index cannot help. It is the same principle explained above, it is a break of the search sequence.

Think of a dictionary like Aurelius, you find words easily, right? Because words are on an index clustered. You know the order and you know you can do a search similar to binary search. But if you only know the end of the word, you can do this search efficiently?

Of course putting in additional columns just by putting in won’t help anything. You have to have discretion. You have to understand this process that at the bottom when you create an index with multiple columns is creating a huge word that makes sense if it’s in that order. And if you look binarially it can be found easily. The search works well while you have a logically organized data sequence.

Place columns that cannot be used in the order described in the index in any kind of query expressive that will be used will only consume space in the index, is a contrary.

It is not easy to determine this without seeing the specific case. Even if it may be useful for a query, may be unnecessary. In some cases the gain may be too small to compensate for. Having experience helps determine this. But it can get in the way. The right way is to test. See if all situations work well with or without index. So the best answer I can give you to the main of your question is: test. See if the darlings benefit from the index.

Of course, in the beginning you will spend time with indexes that help in anything or even hinder. If you are professional, in the best sense of the word, time will make you select better and avoid unnecessary tests without leaving aside those that can bring important results. But without testing even the database indexing geniuses can guarantee what works well in specific situations.

And keep in mind that what is good in a moment may not be more in the future. The darlings can change, the usage load of each changes, the data pattern can make a good index be bad and vice versa. There’s no magic rule.

When you have an index with UNIQUE it is a little more difficult to use other columns after the column with uniqueness restriction.

Note that some people think that putting all columns in an index or creating an index for each column solves all situations. Besides this being an obvious waste, it does not solve all situations (and nor should it, it should only solve what will be necessary and will produce real efficient results). It is virtually impossible to handle all combinations (except with 2 or 3 fields) since the darlings may be looking for data in a partial way or manipulated in some way. Even if you restrict the possible solutions only to the full columns it is not easy to combine all columns unless there are just over 2 or 3 columns.

I’ve written about it in a few responses about the advantages and disadvantages of indices, the index abuse and a specific case it seemed that the index would help but the tests showed that.

Your example

It seems to me it would be useful to have the CREATE INDEX "tb_item_ds_item_Idx" ON tb_item (ds_item);. Sure I just give testing on the real case, seeing in production what happens.

If it should be additional to the other that you should already have with two columns I cannot guarantee. Then I would need to see if there is usefulness in other darlings for the index (cd_item, ds_item) In this query shown it does not fit. And it seems to me that any index you use cd_item will not benefit from columns after it, at least not directly.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.