What is the INDEX index in Mysql for?

Asked

Viewed 14,534 times

15

What is the content for INDEX in Mysql?

ALTER TABLE `tabela`
ADD INDEX `tabela_id_index` (`tabela_id` ASC);
  • 4

    Related: http://answall.com/q/35088/101, http://answall.com/q/55118/101, http://answall.com/q/32052/101, http://answall.com/q/23348/101, http://answall.com/q/27877/101

  • Just for the record, "index" = "index" in English. There seems to be confusion in the question’s title.

  • @Dineirockenbach is not confusion not... Is that I would say: "Unique index" or "Primary index"... then I would also express myself talking about "index index"

3 answers

21


An easy way to think about right is to think of a book summary, it exists to facilitate the localization of book contents, so you find content easier than searching page by page, for this depends on the fact that the Dexes are well defined.


Benefits:

  • Are able to reduce the amount of data the server would have to examine.
  • Are able to help the server sort (e.g. ORDER BY) and avoid creating temporary tables.

The use of "are able" is because not all created Dexes will do this, because maybe they will never be used.


Types of Indexes:

These are the types of Dexes that exist in Engines most common:

  • B-Tree: Most common of all types of Dexes, basically with the exception of Archive Engine all bear him, but with internal differences.
    • Innodb Engine uses uncompressed B+Tree and are referenced by Primary Key.
    • Myisam Engine uses B-Tree with understanding techniques and are referenced by their physical location.
    • NDB Cluster Engine uses T-Tree.
    • Tokudb uses a variation called Fractal Tree.

  • Hash: Own name already explains, it is useful only for differentiation and equality (IN(), = and <=>), but is not used for any other functionality.
    • MEMORY Engine is one of the only ones that has explicit support for this type of index.
    • Innodb Engine possesses adaptative hash Indexes, when Innodb identifies that some values are accessed frequently it creates an index hash and uses it before B-Tree, but this is fully automatic and there is no way to turn it off or create it.

  • R-Tree: Rarely used in Mysql, but used in Postgresql and Postgis, it does not require the use of WHERE to benefit.
    • Innodb Engine is one of the only ones that has support for this type of index.

  • Full-Text Index: Have a different feature and only used for MATCH AGAINST.
    • Myisam Engine has support for this type of index.
    • Innodb Engine has support for this type of index in the latest versions.

To complement, Scaledb uses Patricia-Tries. Infinidb and Infobright have their own architecture (?).


Since the standard is the B-Tree, it is precisely in it that I will use as a basis for everything here.

Limitations:

  • They do not serve for readings that do not start from the left, for example the SELECT * FROM Tabela WHERE Nome LIKE '%A', to try to find all the names that end with A, but you can use A% to find what begins with A.

  • Your query must execute the same order the index set, create a KEY(Nome, Sobrenome) must inform in this order. You can WHERE Nome = 'Inkeliz', but you can’t use WHERE Sobrenome = 'Lima'. You must follow the order, that is WHERE Nome = 'Inkeliz' AND Sobrenome = 'Lima', for example.

  • Create a crease, for example, WHERE Nome LIKE 'ANA%' AND Sobrenome = 'Lima', The Dexes will be used to find Ana, but will not be used to find the last names, this will also create a temporary table. This indicates that you prefer should put X > 1, BETWEEN 1 AND 5 and LIKE as a last condition, i.e. all equality must come before (there are even "tricks"/"gambiarras" with the IN()), I detail this later.

Problems:

Isolate columns:

Consider this:

SELECT Nome FROM Usuarios WHERE ID + 1 = 10

See the problem? Mysql is not able to do this sum. For us, it is easy to know that ID + 1 = 10 is the same as ID = 9, but Mysql does not do this, as a consequence it will scan the entire database and add all the ID, ignoring any index.

Most real example:

SELECT Nome FROM Acessos WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(UltimoAcesso) <= 7

The same problem, you will run the TO_DAYS() in everything and then subtract and then compare it to see if it is less than 7. Indexes are not miraculous in these cases and will be ignored.

Create crease:

In Mysql use CAN IN() be better than BETWEEN, but beware of many conditions, for example:

SELECT Nome FROM Usuario WHERE Genero IN ('h', 'm') AND Idade IN (18, 19, 20, 21, 22, 23, 24, 25) AND UltimaAcao > DATE_SUB(NOW(), INTERVAL 7 DAY);

This would have 2 * 8 possibilities and then theoretically be better than:

SELECT Nome FROM Usuario WHERE Genero IN ('h', 'm') AND Idade BETWEEN 18 AND 25 AND UltimaAcao > DATE_SUB(NOW(), INTERVAL 7 DAY);

This is because using the BETWEEN you would generate two ranges the first for Idade and the second by UltimaAcao, while in the IN() is equality. This still raises another question, which of the two has the most relevance?! But, why use it? Genero IN ('h', 'm'), Assuming they’re the only options.

The gambit of IN() comes by the need to specify all columns, so if do: KEY(Genero, Pais, Idade) you will always have to inform the Gender, Parents and Age. That’s why we use the IN(), that is if there is only h and m then we do Genero IN ('h', 'm'), so we "ignore" the Genero and we can filter by parents, or we make a new IN() with all country lists and we passed to age. *Age last, again, because it has a potential power to create a crease.

In case we use:

SELECT Nome FROM Usuario WHERE Idade > 18 AND Genero = 'h' AND Pais = 'BR'

It would obligatorily ignore the Genero and Pais, because the first, Idade creates a crease.

Relevance and order:

Now which has more relevance, consider this:

SELECT * FROM Usuario WHERE Ativo = 0 AND id =  10;
SELECT * FROM Usuario WHERE id = 10;
SELECT * FROM Usuario WHERE Ativo IN (0, 1) AND id = 10;

The first and the last could use KEY(Ativo, id), but it would be incompatible with the second. But we could also do:

SELECT * FROM Usuario WHERE id = 10 AND Ativo = 0;
SELECT * FROM Usuario WHERE id = 10;
SELECT * FROM Usuario WHERE id = 10 AND Ativo IN (0,1);

So we could use the KEY(id, Ativo), they would all use Dexes, in this case it tends to be better, because the Asset, being binary will return more data than the id which tends to be unique, therefore the id has greater relevance and this has great impact.

Another thing is to know you won’t do it:

SELECT * FROM Usuario WHERE id > 10 AND Ativo = 0;
SELECT * FROM Usuario WHERE Ativo = 0;

Because if those wishes are used KEY(Ativo, id) could be better, but again depends on the case, so using the Ativo IN() before even just selecting the id.

Another problem is creating multiple Dexes separated by each column, this is terrible and are rarely useful.

Separate columns:

Creating separate Indexes can be good if you use:

SELECT * FROM tabela WHERE Nome = 'Inkeliz' OR Idade = 18

If create separate Dexes, Nome and Idade Mysql will make a UNION of the two results, using each separate index. However do this:

 SELECT * FROM tabela WHERE Nome = 'Inkeliz' AND Idade = 18

He will not benefit from both Dexes, he will only choose to use one of them, not both. Most of the time creating debt index is not good, in short.

The ORDER BY and GROUP BY also benefits from Dexes, but as long as it does not create a crease, because in this case I fall in a filesort even, there is no way, except make a RAID-0 of SSD.


Indexes are complex, have two books on this, unfortunately do not cover Mysql 5.6 and Mysql 5.7, and apparently there is no new version to be released, are they:

High Performance Mysql: Optimization, Backups, and Replication, has a version translated by Altas Books (but is older still and there are several complaints of flaws in the translation), in which it mentions the use of the IN(), is not my authorship (and gave a good reduced RAM, deleting repeated dexes).

Relational Database Index Design and the Optimizers, practically a book only right, this is still in line to be read, the book above indicates this too. :)

14

To create a índice (Indexes are used to find records with a specific value of a column quickly. Without an index Mysql has to start with the first record and then read through the entire table until it finds the relevant records)

Quote from the Locaweb

One of the most important items in database optimization is the index creation. With this procedure your query will be much more fast. Ideally create indexes in all fields used in "Where" and "joins" clauses of your wishes.

This Soen response gives a demonstration of how to create an index: Link

Comparison between CREATE INDEX and ALTER TABLEtableADD INDEX: Link

Index mysql documentation: Link

4

Indexes are auxiliary access structures associated with tables and that aim to increase the performance in the execution of queries.

there are two types of index: Clustered: this ordered sequentially.

Nonclustered:

An index whose search key specifies a different order from the sequential order of the file.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.