What are the advantages and disadvantages of using indexes in databases?

Viewed 25,876 times

36

What are the advantages and disadvantages of using indexes in databases?

2 answers

47

Advantages

  • Improves query performance in many cases

    The amount of data that has to be read drops sharply. The way the index is built lets a search touch only a small part of the data. The most common structure is a balanced tree that keeps the keys in order (typically a B-tree, though there are other types), so a lookup works like a binary search with O(log n) complexity; that is, the cost of finding something through the index grows very slowly as the data grows. Example: with 50 rows you can get where you want in at most 6 steps, whereas without the index it could take up to 50. With around a billion rows, the index still gets you there in roughly 30 steps. The difference is brutal.

  • Can bring specific data faster

    If everything you ask to fetch is in the index, querying the index alone may be enough to get the data, without consulting the table at all. And since indexes are smaller and used more often, they tend to stay in memory. This can be a huge advantage.

  • Allows access to ordered data without the cost of ordering when needed

    To deliver the performance described in the first item, the index already keeps its keys sorted, and sorted output is often exactly what you want. When you know you will frequently read the data sequentially in a given order, the index helps a lot. Otherwise a temporary table has to be created (there are ways to optimize this, but they only help a little) and the whole sort has to be done at query time. And sorting is very expensive.

  • It is easy to ensure that key information is not duplicated

    Because the search is quick and simple, finding out whether a key already exists is a very cheap operation, at least compared with finding out the same thing without an index.
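The last three items can be checked against a real engine. The following is only a sketch using Python's built-in `sqlite3` module; the `customer` table, its columns, and the index names are made up for illustration, and the exact wording of the plans varies between SQLite versions and between database products.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, email TEXT, city TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO customer (email, city, notes) VALUES (?, ?, ?)",
    [(f"user{i}@example.com", f"city{i % 20}", "x" * 50) for i in range(1000)],
)

# Ordered access: without an index the engine has to sort at query time.
order_query = "SELECT city, email FROM customer ORDER BY city"
sorted_without_index = plan(conn, order_query)

conn.execute("CREATE INDEX idx_customer_city_email ON customer (city, email)")
sorted_with_index = plan(conn, order_query)

# Index-only fetch: both columns the query needs live in the index.
covered = plan(conn, "SELECT email FROM customer WHERE city = ?", ("city3",))

# Duplicate check: a unique index rejects a key that already exists.
conn.execute("CREATE UNIQUE INDEX idx_customer_email ON customer (email)")
try:
    conn.execute(
        "INSERT INTO customer (email, city, notes) VALUES (?, ?, ?)",
        ("user1@example.com", "city1", ""),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(sorted_without_index)
print(sorted_with_index)
print(covered)
print(duplicate_rejected)
```

In SQLite the first plan should mention a temporary B-tree used for the `ORDER BY`, while the second and third should report a covering index instead.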

Note that the really big advantage is the first item; the others are side effects, although they matter a lot too.
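To get a feel for how slowly the number of steps grows in that first item, here is a small sketch that counts the probes of a plain binary search over a sorted list. A real index is usually a B-tree rather than a sorted array, so treat the numbers as an order of magnitude; the function names are illustrative.

```python
def binary_search_steps(sorted_keys, target):
    """Count how many probes a binary search needs to find `target`."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


def worst_case_steps(n):
    """Worst-case probes for n sorted keys: floor(log2(n)) + 1."""
    return n.bit_length()


keys = list(range(50))
worst_for_50 = max(binary_search_steps(keys, k) for k in keys)

print(worst_for_50)                # worst case measured on 50 keys
print(worst_case_steps(50))        # same value from the formula
print(worst_case_steps(10**9))     # a billion keys
```

A full scan, by comparison, may have to look at every one of the `n` rows.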

Disadvantages

  • Worsens database writing performance

    Every time key data is modified (inserted, updated, deleted), the index has to be written as well. You can think of the index as an additional hidden table in the database. If the modified data appears in several keys (several indexes), all of them must be changed (on insert and delete every index is always affected, although deletes can be optimized, at the price of more expensive reads). Changing the index means reading it and writing to it; this is efficient compared with accessing the table directly, but it is still an additional cost.

  • Increases the storage space consumption of the database (memory and disk)

    Of course this additional table of index keys takes up extra space as well. It is usually smaller than the original data table, but it is still an extra cost. With many indexes, the space they use can even exceed that of the original table. And with many indexes it becomes hard to keep everything in memory.

  • Increases need for internal database maintenance

    This depends somewhat on the implementation, but it is common for index pages to be left unused as keys change. In addition, the DBA may have more things to worry about.

  • Can decrease query performance

    There is no guarantee that every query will be faster with an index. Because the index has to be accessed before the main data, the total time of the two operations can exceed the time of reading the main data alone, even though reading the table without an index is theoretically less efficient. This is more common when the volume of data is small, but it also happens with complex queries or when a large portion of the table will be returned in no particular order.
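The first two disadvantages can be illustrated with a toy model. This is not how a real engine stores data; it is a minimal sketch, with a hypothetical `ToyTable` class, in which each index is a sorted list of `(key, row_id)` pairs that must be updated on every insert.

```python
import bisect


class ToyTable:
    """A list of rows plus one sorted list per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0

    def insert(self, row):
        row_id = len(self.rows)
        self.rows.append(row)
        # Every index has to be updated for every inserted row.
        for col, entries in self.indexes.items():
            bisect.insort(entries, (row[col], row_id))
            self.index_writes += 1
        return row_id

    def stored_entries(self):
        """Rows plus index entries, as a rough stand-in for space used."""
        return len(self.rows) + sum(len(e) for e in self.indexes.values())


def load(indexed_columns, n=1000):
    table = ToyTable(indexed_columns)
    for i in range(n):
        table.insert({"id": i, "name": f"name{i % 100}", "city": f"city{i % 10}"})
    return table


plain = load([])
heavy = load(["name", "city"])

print(plain.index_writes, plain.stored_entries())
print(heavy.index_writes, heavy.stored_entries())
```

With two indexes, each insert does two extra index writes and the number of stored entries triples, which is the trade-off described above in miniature.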

It is virtually impossible to create an index for every possible key, except in extremely simple tables. So it is an illusion to think that indexes will solve every problem. And even if it were possible, they would do more harm than good. You should only create indexes when they are really necessary and it has been proven that they help.

Tips for choosing what to index

  • Avoid creating too many indexes on tables that are written to heavily: indexes help read performance a lot but hurt writes.
  • Always analyze the usage pattern to choose the best type of index. Modern DBMSs offer index types suited to particular OLTP or OLAP access patterns.
  • Smaller keys with no repeated values and no nulls are usually better choices. Where that is not possible, check whether the DBMS lets you filter which rows become keys in the index.
  • When an index uses more than one column, choosing the column order well can help more than one query.
  • Measure before creating an index, and avoid indexes on tables that are too small.
  • The primary index should preferably be sequential and immutable, besides obviously being unique.
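The tip about column order can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `orders` table and the index name are invented for the example, and other engines (or SQLite with collected statistics) may plan these queries differently.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
# One composite index; customer_id is the leading column.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)

# Filters on the leading column, alone or with the second column.
by_leading = plan(conn, "SELECT total FROM orders WHERE customer_id = ?", (1,))
by_both = plan(
    conn,
    "SELECT total FROM orders WHERE customer_id = ? AND status = ?",
    (1, "open"),
)
# Filter on the second column only: the leading column is missing.
by_trailing = plan(conn, "SELECT total FROM orders WHERE status = ?", ("open",))

print(by_leading)
print(by_both)
print(by_trailing)
```

The first two queries can search through the single composite index, while the third falls back to scanning the table, so putting the most commonly filtered column first lets one index serve several queries.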

It is worth pointing out that there are several types of indexes. The most common is the tree in its many variations (B-trees and relatives), each suited to a different situation. But other kinds of index can be very useful as well, such as the hash index, where a formula applied to the key gives a position that is accessed directly (most common in memory), or inverted indexes, widely used for text, where the words contained in the text are used as keys instead of the whole value, as is usual in other indexes.
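The two non-tree index types mentioned above can be sketched in a few lines of Python. A `dict` is itself a hash table, so it stands in for a hash index here, and the inverted index maps each word to the ids of the rows containing it; the sample rows and the `search` helper are invented for illustration.

```python
from collections import defaultdict

rows = [
    {"id": 1, "title": "indexes speed up reads"},
    {"id": 2, "title": "indexes slow down writes"},
    {"id": 3, "title": "measure before you index"},
]

# Hash index: the key is hashed to a slot, giving direct access by id.
hash_index = {row["id"]: row for row in rows}

# Inverted index: each word is a key pointing at the rows that contain it.
inverted_index = defaultdict(set)
for row in rows:
    for word in row["title"].split():
        inverted_index[word].add(row["id"])


def search(word):
    """Return the ids of rows whose title contains `word`, in order."""
    return sorted(inverted_index.get(word, set()))


print(hash_index[2]["title"])
print(search("indexes"))
print(search("missing"))
```

Note that a hash index answers exact-match lookups only; unlike a tree, it keeps no order, so it does not help with ranges or sorted output.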

To learn more about how indexes work, see that answer.

Now you have a more complete answer.

  • Thank you very much, it helped a lot!

5

Basically, the advantage is faster access to the data, and the disadvantage is slower inserts and updates in the database.
