Are unnecessary bank ratings a problem?

Asked

Viewed 2,772 times

24

Indexes are usually added to improve the performance of queries. However, it is not usually recommended to add indexes in all columns, only where the need is identified.

Why this recommendation? What kind of unnecessary index problems can cause?

1 answer

33


Basically it’s because each new index requires more time to update the bank on any change affecting these indices. I’ve seen people suggest doing indices for everything. With only 6 or 7 fields, for all combinations tens are required for hundreds of indexes and all would need to be updated even if you change only one field. Time spent can affect bank scalability.

In some cases the consultation may be impaired as well. The occupied space will harm the cache the data/indexes that are really important. In addition to accessing the index first and then accessing the data has a cost and it can be greater than accessing the data directly in certain patterns. And not always the system optimizer correctly detects whether it should do direct access.

Similarly the automatic recommendations of Dbms systems to create indexes are not always correct.

In addition each new index is a more use for maintenance. Violates the YAGNI. Even makes it difficult upgrades on the model of a database in the system in production.

I’ve seen "respected experts" say that the ideal is to create index for everything and take away those who do not need. I can’t imagine how this makes sense. It is the worst case of premature optimization I have ever seen. All optimization should be created when you measured and it became clear that it is necessary. Index exists fundamentally to optimize access to data.

The most complete database systems have many resources to obtain the information necessary for the right decision. This information can help identify unnecessary indices as well. And as it is a feature to optimize it is necessary to review frequently, what can be good at a time may not be more in the future.

A typical example of index creation error is when it is used exclusively for generating sporadic reports. The loss to generate a sporadic report without index optimization usually causes less damage than the index will cause in the system as a whole.

But one of the worst exaggerations I’ve ever seen, and it happens a lot, is to create an index for colA+colB and another to colA. The first can be used to obtain the same optimization that the second provides, making it unnecessary.

But something I always need to emphasize: measuring is the most important thing. I and even professionals much more experienced in the subject I live having surprises. What seems to help often harms more.

Good reference.


To avoid misunderstanding of the other answer (which has been removed but I think the information is still relevant) by people who are learning about the subject I will put a few points:

  • The loss of performance caused by the creation of unnecessary indexes occurs in whichever operation that changes a data contained in the index keys. It does not matter if it is INSERT, UPDATE and even DELETE depending on the implementation. Changes that do not affect the index key do not cause performance problems. In database systems that use some form of MVCC any change ends up affecting at least the primary key index (which may be clustered) since the data is copied to another location.

  • The performance of SELECT may also be adversely affected as quoted above in my reply. It does not occur frequently, mainly because the planner usually avoids the use of the index in these cases, but can.

  • The creation of the index benefits any data access operation that the database planner identifies that is useful. Usually this occurs in comparisons of WHERE but not in any comparison (this is important). There are cases where the index does not help. Index is not magic, it has an organization that optimizes some relationships, not all. It also occurs when using ORDER BY, JOIN and when there is use of some aggregating functions. But it is good to be clear that only one measurement can guarantee that there will be gain. Of course, the experience can make the developer of the database identify simple cases, but care must be taken. The good swimmer usually takes risks and drowns more than the swimmer who is afraid of unknown water. General rules, as always, can cause more harm than good. Always have to analyze the specific case with specific information for the case.

    Examples where the index probably won’t help: WHERE colA = colB / 2 or WHERE colA LIKE '%Silva%'.

  • Either way the gain occurs in any of the data access operations that have some use of the above clauses. Of course in UPDATE and DELETE only the WHERE and the ORDER BY is relevant.

It is full of myths in software development and as a consequence we have many software with problems because there is the reading of illusory information on the subjects. Leaving room for wrong understanding causes more harm than good.

  • 14

    I wanted to understand the negatives. What’s wrong with the answer?

  • 6

    must be someone lazy to read...

  • 3

    @bigown We have two creators of unnecessary indices on the site (should be from the staff of good practice :D )

  • For reports that use many where's would be interesting to use some indexing software like Elasticsearch to search for example name, email, phone, start date, end date. Then take the Ids of these documents (which are the Ids in the database) and later make a query passing only these Ids using WHERE id IN (1,2,3,7,9,N...) ? http://answall.com/questions/77529/usando-elasticsearch-para-fazer-buscas-complexas-e-depois-pegar-os-ids-para-cons

  • @bigown Your quote "an index to colA+colB and another to colA", at least to Postgresql valley to the WHERE colA = 1 AND colB=2 but not to WHERE colA = 1 OR colB=2, in the second case the index created separately for colA will be used. (first line of the link) https://www.postgresql.org/docs/9.1/static/indexes-bitmap-scans.html Plus, very good response. + 1

  • @Tiagooliveiradefreitas some database do some extra optimizations even.

Show 1 more comment

Browser other questions tagged

You are not signed in. Login or sign up in order to post.