select like over 3 million lines

Asked

Viewed 402 times

6

I have a Maxmind table, with more than 3 million cities. I created the indexes for a simple select and is at satisfactory performance.

The problem is that I am using an autocomplete and for that my select uses like, but always returns timeout (Maximum Execution time of 30 Seconds exceeded).

I chose to use MyISAM as a table with low or no changes.

From what I’ve been reading, 3 million is not an excessive amount, but I don’t know how I should proceed to optimize this search.


CREATE TABLE `cidades` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `pais` char(50) DEFAULT NULL,
  `regiao` char(100) DEFAULT NULL,
  `cidade` char(100) DEFAULT NULL,
  `titulo` char(100) DEFAULT NULL,
  `lat` char(50) DEFAULT NULL,
  `lon` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ID` (`id`),
  KEY `CITY` (`cidade`),
  KEY `ID_2` (`id`,`cidade`,`titulo`)
) ENGINE=MyISAM AUTO_INCREMENT=3166036 DEFAULT CHARSET=utf8;

select cidade.titulo
from city
where cidade.titulo like '%Rio de Janeiro%'
  • In fact, I had tested with like 'Rio de Janeiro%' and the response time was virtually identical to select. Then the problem would be the % at first? This way he makes use of the indexes?

1 answer

5

Cause of poor performance

Use % at the beginning of the like causes a full table scan. It prevents the DBMS from using the table index, needing to read the field completely at all times. This then effectively becomes an excessive operation for 3 million line.

Limited solution

Removing the % from the start, the DBMS will then be able to pick up the first characters before the % is to directly analyze the index to make an initial filter of the results. Only then will it test the like against the full value of the few selected lines.

Mysql even has an option to define how many characters should be included in the text fields index.

Withdraw the % from the beginning solves the performance problem, only that it may not meet the requirement if the auto-complete is of the type that partially searches the entire text. Then it would be better to change the approach, that is, not to use like.

Complete solution

An alternative would be Full-Text Search provided by Mysql. I have no performance data as I have not used this feature in Mysql yet.

Another more advanced alternative would be to use a tool like the Apache Lucene, a search engine developed in Java used by many sites and projects and that can be integrated into your back end with PHP.

  • 1

    These are two options I didn’t know. I’m going to read them now. TKS.

  • 1

    It is worth mentioning most popular own indexing engines like Elasticsearch.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.