How to use indexes beyond the primary key?

Asked

Viewed 91 times

3

I have a question about that.

Consider a table with the following columns:

id (chave primaria),
cliente,
cidade,
estado

I already understood that a consultation with condition WHERE id = X would occur more quickly if that column were an index and the reason for that.

However, queries in this hypothetical scenario would not be based on an id filter, but rather on criteria such as city, state or even customer’s name.

In this case, therefore, it was not clear to me how the 'id' field would speed up the query if the searches will not use it as a criterion to filter/find something.

To improve the performance of the queries in this table, would have to transform

cliente,
cidade,
estado 

indexed?

  • If the indexed field is not directly involved in the query then the index will not be used. How much creating new indexes may or may not be productive, for example what you put: depending on the volume of data in your database an index on the status field may not be a good idea since each index entry may point to a list for a very large list of records made its use ineffective, recalling that the existence of an index burdens the processes of inclusion and update of its database.

  • Using tools like EXPLAIN in key queries can help you decide which indexes are worth creating.

1 answer

3


And I don’t understand why you think a id speed up the query if searches do not use this field. The question presupposes this and is a wrong premise.

Correct indexes help selections with filters that somehow use something contained in the index and yet not in any way.

An index is like a dictionary, it has something classified and because of this classification it is possible to do a binary search, the so-called divide algorithm to conquer then roughly, the index looks right in the middle of the items and checks if the die there is what you are looking for, if it is not he sees if it is smaller or bigger, if it is bigger then he knows that now has to look in the lower half and completely discard the upper half, so now he looks in the 1/4, which is half of half, and he goes on doing it successively until he finds what he wants. Obviously if the data find is smaller, then it goes to the upper half. Of course, the human is not so mathematical, but that’s basically how he quickly finds something in the dictionary. If the words are there without alphabetical order it would be very difficult to find something, probably just by looking word for word. The index is just that.

Another advantage is that you often want to catch everyone who is equal or else pick up the order that this index offers, so it goes walking one by one without worrying about the order that has been entered.

You can only search this way if what you’re looking for is classified, that is, it’s indexed. If you want to search all words that are adjectives you have to go one by one, you can’t use this dictionary index. Unless it had another book ranked so it would only indicate which main book page is each word. It can have several books like this, these are the secondary indexes. On paper this is almost unviable, digitally not.

One of the advantages of the index is that you can have several so you can take the main data and sort it in orders that meet several different demands. An index can be based on id then find something for a certain id is fast. Every time a where or other means involving a filter to use the id simply so the search will use the index that is classified by that id and will find the data in what is called logarithmic complexity or O(logN), so if you have 1 billion items to search it can find in only 30 queries (some cases less by improved algorithm or because you find the data before searching everything, usually only spend 30 if you don’t find it), is absurdly fast.

An index does not need to have only one field, it is possible to compose several fields, but this composition is like a concatenation, so the first field is the most command, the second serves as a tiebreaker for the first. And if you are looking for something in the second field the index cannot be used directly.

In other words an index estado + cidade is very different from an index of cidade + estado, Just to quote an example, not that he’s the best. In the first one what commands and can be optimized is the state, and the city serves only to disambiguate the search. In the second the city is in charge, and in some rare cases could use the second field to disambiguate because most cities are already unique. And if the database is normalized right the state doesn’t even make sense because the city is already unique (has a id and not her name).

The index would be something like this: SPSão Paulo, SCBlumenau. So when you go to consult for the state it’s easy, if you go to consult only for the city it can’t take advantage of the index because the city is not in the beginning, it’s like in the dictionary if you wanted to search for words for the second syllable, right? It looks one by one. The same goes if the fields are reversed on another index: São PauloSP or BlumenauSC, now finds by the city but not by the state. Remembering that are examples in non-standard database.

From experience I know that there are cases that have an index with state and then city makes some sense, but having city and then being not usually, but may have some case. If you have one with estado + cidade "never" needs to have a single estado, because nothing prevents you using this index with two fields to search only one of them, as long as it is the first.

So if you want to find quickly and possibly in city order you have to have an index with this field. He may be alone or he may have others, but the secondary can only be used together with the primary.

You don’t have to turn anything, you have to create suitable index to find faster what you want. I suggest a thorough study of the subject, otherwise you’ll always be trying to guess. One of the reasons people criticize the use of relational databases is that they don’t know how to use it properly. Then they use other things and have other problems, some not so apparent, and they think they are having some gain, but it is only illusion. Who does not know how to use a RDBMS will not be able to make a good decision than to use, and then it is better to adopt a standard solution, which is the same RMDBS :)

Indexes are the big secret of database performance. There are others, but this is the most important. Some points to help:

  • Thank you, @Maniero. I’m starting in SQL and, in the materials I had access to, the examples always transform the primary key as an index. Which doesn’t make much sense to me since, if I’m going to do a consultation, I’m hardly going to use that criterion in Where. Hence another question arises: in which scenarios would it be interesting to turn the primary key into index? Can you give me examples?

  • This is the most commonly used cryerium fired. The primary key is always an index.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.