Mysql does not use Input in query (Inner Join)

Asked

Viewed 309 times

5

I have a query that is taking time to be executed, analyzing the explain I see that Mysql is not using the Word in one of the tables.

Tables:

CREATE TABLE `rel_financeiro` (
  `protocolo` char(13) NOT NULL,
  `aceito` datetime DEFAULT NULL,
  `processado` datetime DEFAULT NULL,
  `valor_previsto` decimal(10,2) NOT NULL DEFAULT '0.00',
  `valor_pago` decimal(10,2) NOT NULL DEFAULT '0.00',
  PRIMARY KEY (`registro`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `emails` (
  `id` char(36) NOT NULL,
  `data` datetime NOT NULL,
  `de` varchar(200) NOT NULL,
  `para` varchar(200) NOT NULL,
  `protocolo` varchar(13) DEFAULT NULL,
  `assunto` varchar(255) NOT NULL,
  `pasta` varchar(60) NOT NULL,
  `lido` tinyint(1) NOT NULL DEFAULT '0',
  `headers` text NOT NULL,
  `mensagem` text NOT NULL,
  PRIMARY KEY (`id`),
  KEY `protocolo` (`protocolo`),
  KEY `pasta` (`pasta`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Query:

SELECT r.protocolo, r.aceito, r.valor_previsto
  FROM rel_financeiro r
 INNER JOIN emails e ON r.protocolo = e.protocolo
 WHERE e.id IN ('e665a3e5-098f-a754-d4fe-5602a15aa191', '39b191b3-8f04-11e5-b9b8-040166406e01', '3e6c68ee-8f04-11e5-b9b8-040166406e01');

Explain result:

+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type  | possible_keys     | key     | key_len | ref   | rows    | Extra       |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
|  1 | SIMPLE      | e     | const | PRIMARY,protocolo | PRIMARY | 108     | const |       1 |             |
|  1 | SIMPLE      | r     | ALL   | NULL              | NULL    | NULL    | NULL  | 5197139 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+

I’ve tried so many different ways and so far nothing.

  • Tarcio, glad you solved your problem. If you want to post a reply with the commands INSERT \ SELECT who solved their problem I’m sure the information will be welcome to the community. That said, it would be nice to get some dumps that reproduced the problem, in that way we came out of this impasse of a no-question with two or three no-answers.

  • When I created the tables hiding confidential information (as you suggested), everything worked, then I came up with the solution, it’s crazy, but that’s it. If I provide a dump it will be just from the table without the problem.

  • Tarcio, I reversed your edits. The first invalidated the existing answers. The second is better to post there below, in the area of answers. It is OK to answer the question itself, but you need to use the bottom part, and not embed the answer in the question.

2 answers

3

Your create table is setting a primary key with a column that does not exist, registro. Maybe the intention was

  PRIMARY KEY (`protocolo`)

If the desired primary key is the one already defined, create an index:

  KEY (`protocolo`)

or single index

  UNIQUE (`protocolo`)

As appropriate.

  • The correct KEY is protocol, I did so many tests changing the structure of the table that I didn’t even notice the error, thanks for fixing.

3

Assuming protocolo be the PK of rel_financeiro as pointed out in reply of fbiazi, the problem seems to be the position of the Join:

SELECT r.protocolo, r.aceito, r.valor_previsto
  FROM emails e 
 INNER JOIN rel_financeiro r ON r.protocolo = e.protocolo 
 WHERE e.id IN ('e665a3e5-098f-a754-d4fe-5602a15aa191', '39b191b3-8f04-11e5-b9b8-040166406e01', '3e6c68ee-8f04-11e5-b9b8-040166406e01'); 

EXPLAIN query original

You want the from be the table emails for the clause where use the index of id and reduce emails for a small data set. After that the join can use the index in protocolo.

Of course the optimizer of darlings could do his part and reach a similar conclusion alone, but there are no guarantees in this case. See the result I received with his query original:

EXPLAIN query alterada

Mysql is being very literal, bringing all the financial relay to memory first and only then trying to do the join of the result with the emails.


Functional example in SQL Fiddle

  • Running in an empty table the result is really this, but when we have records (5 million in one table and 300 thousand in the other) back not to use the KEY, even with your example (that really is better than mine)

  • 5 million in financial relay and 300k in emails? I think the optimizer is getting lost... Ever tried to run ANALYZE TABLE in both tables and repeat the query?

  • yes I have already run the ANALYSE TABLE, and nothing has changed, I have recreated the tables in another base, taking a dump with the data and nothing.

  • 2

    I’d still use an Outer Join in that case, not that I think it’ll change the problem. More specifically a LEFT JOIN with the email table on the left (I imagine the Planner already do this automatically, however). Note: if there is expectation of many sequential readings, Mysql abandons the index in favor of full scan. A good benchmark would run without the IN, using =, and maybe making separate queries for each id.

  • 1

    I agree with Bacco on the LEFT JOIN. It’s very difficult to test without a data load. You can generate a mass of data fake to simulate the problem? (In the worst case a dump compressed after masking all confidential data).

  • Now the thing has gotten crazier... I generated a database hiding confidential data (such as content of emails, addresses etc), and surprise the index works with these new test tables that I created. The structure is exactly the same, the only difference is content of some columns that are not even being used in the query.

Show 1 more comment

Browser other questions tagged

You are not signed in. Login or sign up in order to post.