Mysql very slow sorting for results with many records

Asked

Viewed 776 times

7

I have two tables, the first is used to group the data of the second with some information about the set, in order to facilitate the pagination of the contents.

Table 1:

 - id              INT(11)    NOT NULL    UNSIGNED    AUTO_INCREMENT
 - date            DATE       NOT NULL
 - total_records   INT(11)    NOT NULL    UNSIGNED
 - created_date    TIMESTAMP
 - updated_date    TIMESTAMP

 - date_index      INDEX      date DESC

Table 2:

 - id              INT(11)    NOT NULL    UNSIGNED    AUTO_INCREMENT
 - order           INT(11)    NOT NULL    UNSIGNED
 - content         LONGTEXT   NOT NULL
 - record_id       INT(11)    NOT NULL    UNSIGNED
 - created_date    TIMESTAMP
 - updated_date    TIMESTAMP

 - content_search  FULLTEXT   content

To tabela 2 relates to tabela 1 across the countryside record_id.

The appointments I’m trying to make are as follows::

-- SQL 1
SELECT t1.`date`, t2.`id`, t2.`order`, t2.`content`
FROM `tabela1` as t1
INNER JOIN `tabela2` as t2
ON t2.`record_id` = t1.`id`
WHERE MATCH(t1.`content`) AGAINST( '"palavras chaves" @5' IN BOOLEAN MODE)
ORDER BY t1.`date`
LIMIT 0, 10;

-- SQL 2
SELECT t1.`date`, t2.`id`, t2.`order`, t2.`content`
FROM `tabela2` as t2
INNER JOIN `tabela1` as t1
ON t2.`record_id` = t1.`id`
WHERE MATCH(t1.`content`) AGAINST( '"palavras chaves" @5' IN BOOLEAN MODE)
ORDER BY t1.`date`
LIMIT 0, 10;

-- SQL 3
SELECT *
FROM (
    SELECT t1.`date`, t2.`id`, t2.`order`, t2.`content`
    FROM `tabela2` as t2
    INNER JOIN `tabela1` as t1
    ON t2.`record_id` = t1.`id`
    WHERE MATCH(t1.`content`) AGAINST( '"palavras chaves" @5' IN BOOLEAN MODE)
) as tmp
ORDER BY tmp.`date`
LIMIT 0, 10;

-- SQL 4
SELECT *
FROM (
    SELECT t2.`id`, t2.`order`, t2.`content`, t2.`record_id`
    FROM `tabela2` as t2
    WHERE MATCH(t1.`content`) AGAINST( '"palavras chaves" @5' IN BOOLEAN MODE)
) as tmp
INNER JOIN `tabela1` as t1
ON tmp.`record_id` = t1.`id`
ORDER BY tmp.`date`
LIMIT 0, 10;

Using a EXPLAIN in each one it is possible to see that it is using file ordering and temporary tables, that is to say Using filesort and Using temporary.

For large keywords, with five words for example, the query and ordering is extremely fast, even because few records return. But if I use only one keyword which results in multiple records, the query is quick, but the ordering takes too long. In any situation the consultation is quick, but with the ordering these consultations take on average 156 seconds.

I tried to change the tabela 2 denormalizing and placing a copy of the field date to eliminate the INNER JOIN. The query improved significantly by dropping to an average of 56 seconds in single-keyword queries. But it still has problems in ordering, even creating an index for the date with decreasing ordering.

These problems occur in Mysql 5.6 and 5.7, as I switched the engine from Myisam to Innodb. Myisam was in no condition to slow and with this change to search without ordering, they were instantaneous regardless of the number of keywords. In Mysql 5.5 com in Myisam was very fast and working perfectly in all situations.

After changing the tabela 2 and put the date field, my queries were as follows

-- SQL 1
SELECT *
FROM `tabela2` USE INDEX (`tabela_2_date_desc`)
WHERE MATCH (`content`) AGAINST ('"palavra" @5' IN BOOLEAN MODE)
LIMIT 0 , 10;

-- SQL 2
SELECT *
FROM `tabela2`
WHERE MATCH (`content`) AGAINST ('"palavra" @5' IN BOOLEAN MODE)
ORDER BY `date` DESC
LIMIT 0 , 10;

The first query returns instantaneously but is not ordered, the second query returns with an average of 55 seconds.

Below the EXPLAINS

-- SQL 1
1   SIMPLE  tabela2     fulltext    search_content  search_content  0   const   1   100.00  Using where

-- SQL 2
1   SIMPLE  tabela2     fulltext    search_content  search_content  0   const   1   100.00  Using where; Using filesort

Someone can help me optimize, without using another type of bank or search engine like Lucene?

I’ve tried to take advantage of SQL_CACHE.

To tabela 1 has about 1000 records and the tabela 2 certain of 160,000.

  • Perhaps the solution by adjusting memory usage by Mysql http://dev.mysql.com/doc/refman/5.7/en/memory-use.html

  • You’re still in trouble?

  • I am, I should probably use another indexing system

  • @marcusagm implemente Partition table. I have had similar situations to table with more than 2mi of records.

  • Hello, try to create indexes for the main most requested fields in the table, will further improve the performance in the speed of the result.

  • Thanks @Wilsonrosagomes, but the problem is precisely with a type of index. The necessary indexes have already been created.

  • Ask you two questions, which engine used in your bank, Innodb? The values to be searched can be mapped, that is, they have patterns that allow identification via regular expression?

  • 1

    Is Innodb, as said in the question, only because of the version of Mysql, because in version 5.7 Innodb is faster, I do not understand why, but stayed. And no, you look for whatever you want in a text, so you can’t treat it with regular expression. Unfortunately the solution was to use another tool more suitable for indexing

Show 3 more comments

1 answer

2

If you enter data in this table by filling in the column date with now(), try to sort the table 2 for id, which is sequential and will likely follow the date sequence.

If you have imported these disordered tables, I recommend you create another table already sorted by date and subsequently revalidate yours for performance analysis.

  • The idea is good, but I have to sort the dates down. Even though the ID allows you to sort the dates correctly, the process uses many index pages, leaving it very slow. The same solution is to use something more appropriate than mysql for this type of query. Thanks for the idea, had already tried =\

  • The id is as index in tables as well?

  • Yes as Primary key

  • Knowing the pros and cons, it’s not legal allow the user to do a search in the entire database as it is already predicted slowness as mentioned by you. What would help your search would be:

    • <code>SELECT t1.date, t2.id, t2.order, t2.content FROM tabela2 as t2 INNER JOIN tabela1 as t1 ON t2.record_id = t1.id WHERE t1.date between 'DATA_INICIO' and 'DATA_FIM'</code>
  • and save to the temporary table.

  • Yes, but since it is a mandatory search for terms in the texts, I preferred to use an indexer.

  • and later would search with match in this restricted period. Currently I work with tables with data from 3 years up and we limit the research to a period of 30 days, avoiding the loss of performance caused when there is no limitation of the period.

Show 3 more comments

Browser other questions tagged

You are not signed in. Login or sign up in order to post.