LEFT JOIN with WHERE in the field of the first table OR field of the second table

Asked

Viewed 663 times

1

SELECT COALESCE(tb2.numero_novo, tb1.numero_original) as numero
FROM
tb1
LEFT JOIN tb2 ON tb2.id = tb1.id
WHERE 
tb1.numero_original = <PARAM> OR tb2.numero_novo = <PARAM>

The above query generates a full table scan. How to optimize this query?

EXPLAIN:

+----+-------------+-------+-------+----------------+----------------+---------+------+----------+----------------------------------------------------+
| id | select_type | table | type  | possible_keys  | key            | key_len | ref  | rows     | Extra                                              |
+----+-------------+-------+-------+----------------+----------------+---------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | tb1   | index | numero_original| numero_original| 5       | NULL | 11683843 | Using index                                        |
|  1 | SIMPLE      | tb2   | ALL   | PRIMARY        | NULL           | NULL    | NULL |        2 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+-------+----------------+----------------+---------+------+----------+----------------------------------------------------+

The problem I need to solve is:
tb1 is a table containing the field numero_original. I want to create the table tb2 to store a new value for this field, (which I called numero_novo). I did it because I can’t change the field value numero_original, and would not want to create the field numero_novo on the table tb1, because it would be NULL in over 95% of the 11 million record table.

EDIT:

As they talked about indexes, I’m putting more information:

  • tb1 has as primary key the field id;
  • The field numero_original table tb1 has an index;
  • tb2 has as primary key the fields tb1_id and numero_novo;
  • The table tb1 has 11mi+ records;
  • The newly created table tb2 has 2 test records.

.

mysql> show indexes from tb1;
+-------+------------+-------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name          | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tb1   |          0 | PRIMARY           |            1 | id              | A         |    11066164 | NULL     | NULL   |      | BTREE      |         |               |
| tb1   |          1 | numero_original   |            1 | numero_original | A         |     5533082 | NULL     | NULL   | YES  | BTREE      |         |               |
+-------+------------+-------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

mysql> show indexes from tb2;
+-------+------------+------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name   | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tb2   |          0 | PRIMARY    |            1 | tb1_id         | A         |           2 | NULL     | NULL   |      | BTREE      |         |               |
| tb2   |          0 | PRIMARY    |            2 | numero_novo    | A         |           2 | NULL     | NULL   |      | BTREE      |         |               |
+-------+------------+------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

EXPLAIN by selecting with WHERE only in the first table (tb1):

+----+-------------+-------+------+----------------+----------------+---------+-------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys  | key            | key_len | ref   | rows | Extra                                              |
+----+-------------+-------+------+----------------+----------------+---------+-------+------+----------------------------------------------------+
|  1 | SIMPLE      | tb1   | ref  | numero_original| numero_original| 5       | const |    4 | Using index                                        |
|  1 | SIMPLE      | tb2   | ALL  | PRIMARY        | NULL           | NULL    | NULL  |    2 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+----------------+----------------+---------+-------+------+----------------------------------------------------+

EXPLAIN by selecting with WHERE only in the second table (tb2):

+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                     | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
|  1 | SIMPLE      | tb2   | ALL    | PRIMARY       | NULL    | NULL    | NULL                    |    2 | Using where |
|  1 | SIMPLE      | tb1   | eq_ref | PRIMARY       | PRIMARY | 4       | sistema.tb2.tb1_id      |    1 | NULL        |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
  • Is it Mysql? SQL Server? Don’t forget to [Edit] the tags of your question.

  • Mysql. Already updated, thanks.

  • All you care about are the lines of tb2 where there is ID correspondence with the tb1? In that case you could use one INNER JOIN instead of the LEFT.

  • 1

    I need the lines of tb1, even if there are no lines in the tb2

  • @Raulbaldnerjunior as only explain SELECT COALESCE(tb2.numero_novo, tb1.numero_original) as numero&#xA;FROM&#xA;tb1&#xA;LEFT JOIN tb2 ON tb2.id = tb1.id&#xA;WHERE &#xA;tb1.numero_original = <PARAM>

  • @gmsantos edited the question to add this information

Show 1 more comment

1 answer

0


Hello, Baldner

Check if this helps you:

/* Adquirindo os registros da tb1 que possuem registros na tb2, ou seja, os 5% */
SELECT tb2.numero_novo AS numero
  FROM tb2
 INNER JOIN tb1 ON tb1.id = tb2.id
 WHERE tb2.numero_novo = <PARAM>
/* Realizando união com os registros da tb1 que NÃO possuem registros na tb2, ou seja, os 95% restantes */
 UNION
SELECT tb1.numero_original AS numero
  FROM tb1
 WHERE tb1.numero_original = <PARAM>
   AND NOT EXISTS(SELECT TOP 1 1 FROM tb2 WHERE tb2.id = tb1.id)

My idea was to work with the 5% that have records in tb2 using INNER JOIN that is faster than LEFT JOIN.

The other 95% records of tb1 that do not have records in tb2 just use NOT EXISTS instead of LEFT JOIN.

The Union would have the role of the improved AB.

Let us know if this has improved or worsened.

I hope I’ve helped :)

  • Thanks for the answer! Unfortunately, I did not intend to use UNION because I need to use other search parameters in tb1. I would need to duplicate the parameters in the 2 SELECTS or else put this UNION as subquery and filter once only after (which I think would be very bad).

  • I don’t know what other parameters you are needing but, depending on the case, it is worth doubling in the two queries. Why don’t you clear the doubt by performing a performance test ?

Browser other questions tagged

You are not signed in. Login or sign up in order to post.