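Slow JOIN query with OR in WHERE clause - possibly a missing index?

Time: 2017-09-15 21:16:24

Tags: mysql left-join query-performance

I am trying to retrieve a paginated list, and the total count, of "notifications" about "cases" that belong to a specific user.

The notifications must meet a few conditions: "not locked", "not private", and "not yet seen". The query should return the number found, ordered by creation date descending.

The final condition is that either the user did not create the notification themselves, or the notification is of type "conduct" (an enum) and the user's id is referenced in the notification's ref_id.

This query takes over 5 seconds to run with 200k rows in recent_changes, fewer than 4k rows in cases, and 50 users.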

+-----+
| cnt |
+-----+
|  13 |
+-----+
1 row in set (4.67 sec)

Can this query be optimized as it stands, or does it need to be restructured?

SELECT count(*) as cnt
 FROM recent_changes rc 
 LEFT JOIN `case` c on c.id = rc.case_id 
 LEFT JOIN `user` u on u.id = rc.user_id
 WHERE 
 (
   rc.user_id != c.user_id AND c.user_id = '25'
   OR
   (rc.type = 'conduct' AND rc.ref_id = '25')
 )
 AND c.locked = 'N'  AND rc.private != 'Y' 
 AND seen = 'false'
 ORDER BY rc.datecreated DESC;

EXPLAIN output:

+----+-------------+-------+--------+--------------------------+-------------------------+---------+--------------------------+------+------------------------------+
| id | select_type | table | type   | possible_keys            | key                     | key_len | ref                      | rows | Extra                        |
+----+-------------+-------+--------+--------------------------+-------------------------+---------+--------------------------+------+------------------------------+
|  1 | SIMPLE      | c     | ALL    | PRIMARY,user_user_id_idx | NULL                    | NULL    | NULL                     | 3699 | Using where; Using temporary |
|  1 | SIMPLE      | rc    | ref    | idx_recent_changes_case  | idx_recent_changes_case | 5       | xxxxxxxxxxxxx.c.id       |   25 | Using where                  |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                  | PRIMARY                 | 4       | xxxxxxxxxxxxx.rc.user_id |    1 | Using index                  |
+----+-------------+-------+--------+--------------------------+-------------------------+---------+--------------------------+------+------------------------------+

Indexes on recent_changes:

+----------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name                     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| recent_changes |          0 | PRIMARY                      |            1 | id          | A         |      182807 |     NULL | NULL   |      | BTREE      |         |
| recent_changes |          1 | recent_changes_user_id_idx   |            1 | user_id     | A         |          96 |     NULL | NULL   | YES  | BTREE      |         |
| recent_changes |          1 | idx_recent_changes_user_case |            1 | user_id     | A         |          92 |     NULL | NULL   | YES  | BTREE      |         |
| recent_changes |          1 | idx_recent_changes_user_case |            2 | case_id     | A         |       18280 |     NULL | NULL   | YES  | BTREE      |         |
| recent_changes |          1 | idx_recent_changes_case      |            1 | case_id     | A         |        7312 |     NULL | NULL   | YES  | BTREE      |         |
+----------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

Indexes on the case table:

+-------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name         | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+
| case  |          0 | PRIMARY          |            1 | id                  | A         |        3753 |     NULL | NULL   |      | BTREE      |         |
| case  |          1 | id_idx           |            1 | member_id           | A         |        3753 |     NULL | NULL   | YES  | BTREE      |         |
| case  |          1 | user_user_id_idx |            1 | user_id             | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |
| case  |          1 | case_ha_id       |            1 | health_authority_id | A         |          28 |     NULL | NULL   | YES  | BTREE      |         |
+-------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+

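Conceptually, it does the following:

Find the recent rows in recent_changes where:

i) the recent_changes row joins via case_id to a case that is owned by the current user_id
ii) and the current user_id did not create the recent_changes row

OR

i) the recent_changes row is of type 'conduct' and the current user_id is in the recent_changes.ref_id column

If I remove the "OR (rc.type = 'conduct' AND rc.ref_id = '25')" condition, I get a < 1s response time.

If I remove the "rc.user_id != c.user_id AND c.user_id = '25' OR" condition, it still takes about 5 seconds to complete.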

Edit

Although I was unable to join `case` ahead of rc (I got an error about column 'rc.user_id' in 'where clause'), I have now joined `user` first.

New query:

SELECT count(*) as cnt
FROM `user` u
LEFT JOIN `recent_changes` rc on u.id = rc.user_id
LEFT JOIN `case` c on c.id = rc.case_id
WHERE
(
  rc.user_id != c.user_id AND c.user_id = '25'
  OR
  (rc.type = 'conduct' AND rc.ref_id = '25')
)
AND c.locked = 'N' AND rc.private != 'Y'
AND seen = 'false'
ORDER BY rc.datecreated DESC;

Removing the ORDER BY, whose performance impact I now understand better, did not seem to add anything to the query with the new join order.

Using UNION is not faster either:

select count(*) as cnt
FROM (
SELECT count(*)
FROM `user` u
LEFT JOIN `recent_changes` rc on u.id = rc.user_id
LEFT JOIN `case` c on c.id = rc.case_id
WHERE rc.user_id != c.user_id AND c.user_id = '25'
AND c.locked = 'N' AND rc.private != 'Y' AND seen = 'false'
UNION ALL
SELECT count(*) as cnt
FROM `user` u
LEFT JOIN `recent_changes` rc on u.id = rc.user_id
LEFT JOIN `case` c on c.id = rc.case_id
WHERE rc.type = 'conduct' AND rc.ref_id = '25'
AND c.locked = 'N' AND rc.private != 'Y' AND seen = 'false'
) x

However, running each SELECT on its own shows that the first SELECT takes only .3s, whereas the second takes over 4s:

EXPLAIN SELECT count(*) FROM `user` u  LEFT JOIN `recent_changes` rc on u.id = rc.user_id  LEFT JOIN `case` c on c.id = rc.case_id  WHERE rc.user_id != c.user_id AND c.user_id = '25' AND c.locked = 'N'  AND rc.private != 'Y'  AND seen = 'false';

+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+
| id | select_type | table | type   | possible_keys                                                                   | key                     | key_len | ref                      | rows | Extra       |
+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+
|  1 | SIMPLE      | c     | ref    | PRIMARY,user_user_id_idx                                                        | user_user_id_idx        | 5       | const                    |  383 | Using where |
|  1 | SIMPLE      | rc    | ref    | recent_changes_user_id_idx,idx_recent_changes_user_case,idx_recent_changes_case | idx_recent_changes_case | 5       | hsaedmp_jason.c.id       |   20 | Using where |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                                                         | PRIMARY                 | 4       | hsaedmp_jason.rc.user_id |    1 | Using index |
+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+

Runs in < .5s

EXPLAIN SELECT count(*) as cnt FROM `user` u  LEFT JOIN `recent_changes` rc on u.id = rc.user_id  LEFT JOIN `case` c on c.id = rc.case_id  WHERE rc.type = 'conduct' AND rc.ref_id = '25' AND c.locked = 'N'  AND rc.private != 'Y'  AND seen = 'false';

+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+
| id | select_type | table | type   | possible_keys                                                                   | key                     | key_len | ref                      | rows | Extra       |
+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+
|  1 | SIMPLE      | c     | ALL    | PRIMARY                                                                         | NULL                    | NULL    | NULL                     | 3797 | Using where |
|  1 | SIMPLE      | rc    | ref    | recent_changes_user_id_idx,idx_recent_changes_user_case,idx_recent_changes_case | idx_recent_changes_case | 5       | hsaedmp_jason.c.id       |   20 | Using where |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                                                         | PRIMARY                 | 4       | hsaedmp_jason.rc.user_id |    1 | Using index |
+----+-------------+-------+--------+---------------------------------------------------------------------------------+-------------------------+---------+--------------------------+------+-------------+

Runs in > 4s

I think a necessary index is missing according to EXPLAIN: the case table shows key = NULL, which is not good.

I am confused: EXPLAIN shows that the case table is not using a key, yet it seems that it is the recent_changes table that needs an index on the ref_id column?

Below is the EXPLAIN with that index. It looks much better here, but I have not been able to test it in production yet.

+----+-------------+-------+------------+--------+-------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys                                                                                                                       | key                    | key_len | ref                      | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------------+------+----------+-------------+
|  1 | SIMPLE      | rc    | NULL       | ref    | recent_changes_user_id_idx,idx_recent_changes_user_case,idx_recent_changes_case,idx_recent_changes_case_date,idx_recent_changes_ref | idx_recent_changes_ref | 5       | const                    | 2096 |     3.12 | Using where |
|  1 | SIMPLE      | u     | NULL       | eq_ref | PRIMARY                                                                                                                             | PRIMARY                | 4       | hsaedmp_jason.rc.user_id |    1 |   100.00 | Using index |
|  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY                                                                                                                             | PRIMARY                | 4       | hsaedmp_jason.rc.case_id |    1 |    50.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------------+------+----------+-------------+
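The EXPLAIN output in this edit mentions an index named idx_recent_changes_ref, but the post does not show how it was defined. As a minimal sketch, assuming it is a plain single-column index on ref_id (which would be consistent with key_len = 5 and ref = const in that plan), it could be created and checked roughly like this:

```sql
-- Assumption: idx_recent_changes_ref covers only ref_id; the post does not
-- show the actual definition of this index.
ALTER TABLE recent_changes ADD INDEX idx_recent_changes_ref (ref_id);

-- Re-check the slow 'conduct' branch. Ideally rc is now read through the
-- new index (type = ref) rather than the plan starting with a full scan
-- of `case` (type = ALL, key = NULL).
EXPLAIN
SELECT count(*) AS cnt
FROM `user` u
LEFT JOIN `recent_changes` rc ON u.id = rc.user_id
LEFT JOIN `case` c ON c.id = rc.case_id
WHERE rc.type = 'conduct' AND rc.ref_id = '25'
  AND c.locked = 'N' AND rc.private != 'Y' AND seen = 'false';
```

With an index that starts on ref_id, the optimizer can begin from the few recent_changes rows that reference the user and then look up each case by primary key, instead of scanning every case and probing recent_changes per case.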

Update

I rewrote the query using a UNION statement, changed the JOIN order, and added a composite index on the recent_changes table, which brought the query response time down to < 10ms.

Here is the new query using the UNION statement.

select count(*) as num
FROM (
(
SELECT rc1.*
FROM `user` u1 
LEFT JOIN `recent_changes` rc1 on u1.id = rc1.user_id 
LEFT JOIN `case` c1 on c1.id = rc1.case_id 
WHERE 
(rc1.user_id != c1.user_id AND c1.user_id = '1')
AND c1.locked = 'Y'
AND rc1.private != 'Y' 
AND seen = 'false'
ORDER BY rc1.datecreated DESC
)
UNION
(
SELECT rc.*
FROM `user` u 
LEFT JOIN `recent_changes` rc on u.id = rc.user_id 
LEFT JOIN `case` c on c.id = rc.case_id 
WHERE
(rc.type = 'conduct' AND rc.ref_id = '1')
AND c.locked = 'Y'
AND rc.private != 'Y' 
AND seen = 'false'
ORDER BY rc.datecreated DESC
)
) x;

I created the index based on the final query I needed.

ALTER TABLE recent_changes ADD INDEX idx_recent_changes_notification (type, ref_id, private, seen, user_id);
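Since only a count is needed, the UNION query from this post could probably be slimmed down. The following is a sketch rather than the author's query: it assumes recent_changes.id is the primary key (as the index listing shows) and that seen is a column of recent_changes (it appears in the composite index). It selects only rc.id, drops the per-branch ORDER BY, and writes the joins as inner joins, since the WHERE conditions on rc and c already discard NULL-extended rows.

```sql
-- Sketch: count distinct notifications matching either branch.
-- UNION (not UNION ALL) removes duplicates, so a row that satisfies both
-- branches is counted once, just as with the original OR condition.
SELECT count(*) AS num
FROM (
    SELECT rc.id
    FROM recent_changes rc
    JOIN `user` u ON u.id = rc.user_id
    JOIN `case` c ON c.id = rc.case_id
    WHERE rc.user_id != c.user_id AND c.user_id = '1'
      AND c.locked = 'Y' AND rc.private != 'Y' AND rc.seen = 'false'
    UNION
    SELECT rc.id
    FROM recent_changes rc
    JOIN `user` u ON u.id = rc.user_id
    JOIN `case` c ON c.id = rc.case_id
    WHERE rc.type = 'conduct' AND rc.ref_id = '1'
      AND c.locked = 'Y' AND rc.private != 'Y' AND rc.seen = 'false'
) x;
```

Deduplicating on the primary key alone gives the same count as deduplicating whole rows, while keeping the temporary result much narrower.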

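Thanks for everyone's input!

1 answer:

Answer 0: (score: 0)

The smaller table should be placed first in the join clause. It depends on the number of records in each table. I think your user table is the smallest one, so put it first. The 'rc' table seems to be the largest one, so you should put it last in the join.

Here is an example.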

SELECT count(*) as cnt
FROM `user` u 
-- recent_changes has to be joined before `case`, because the `case` ON clause references rc.case_id
LEFT JOIN `recent_changes` rc on u.id = rc.user_id 
LEFT JOIN `case` c on c.id = rc.case_id 
WHERE 
(
    rc.user_id != c.user_id AND c.user_id = '25'
    OR
    (rc.type = 'conduct' AND rc.ref_id = '25')
)
AND c.locked = 'N'  AND rc.private != 'Y' 
AND seen = 'false'
ORDER BY rc.datecreated DESC;

Also, please see the post below. It is about MSSQL, but almost all DBMSs take the same view.

https://www.mssqltips.com/sqlservertutorial/3201/how-join-order-can-affect-the-query-plan/
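One caveat for MySQL: the optimizer generally picks the join order itself for inner joins, and a LEFT JOIN whose table is filtered in the WHERE clause is typically treated as an inner join, so the textual order may not be the order that runs. The question's first EXPLAIN already shows this (the plan starts with c although the query starts with rc). A hedged sketch for checking the chosen order, and for forcing one with STRAIGHT_JOIN as an experiment:

```sql
-- The row order in EXPLAIN output is the join order MySQL actually chose.
EXPLAIN
SELECT count(*) AS cnt
FROM recent_changes rc
JOIN `case` c ON c.id = rc.case_id
WHERE rc.type = 'conduct' AND rc.ref_id = '25'
  AND c.locked = 'N' AND rc.private != 'Y' AND rc.seen = 'false';

-- STRAIGHT_JOIN forces the tables to be joined in the order written
-- (here: recent_changes first, then `case`), for comparison only.
EXPLAIN
SELECT STRAIGHT_JOIN count(*) AS cnt
FROM recent_changes rc
JOIN `case` c ON c.id = rc.case_id
WHERE rc.type = 'conduct' AND rc.ref_id = '25'
  AND c.locked = 'N' AND rc.private != 'Y' AND rc.seen = 'false';
```

If the forced order is clearly faster, that usually points to a missing index or stale statistics rather than to a need to keep STRAIGHT_JOIN in the final query.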

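Update

I reviewed your question and found another suspect: the ORDER BY clause. When a query returns many rows, the time cost of ORDER BY increases sharply. This is a problem I have run into often. Have you tried removing the ORDER BY clause? Is it much faster?

See: Why is this INNER JOIN/ORDER BY mysql query so slow?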
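For the count in particular, the ORDER BY cannot change the result, because an aggregate query without GROUP BY returns a single row. A minimal sketch of separating the two concerns, where the page size of 20 is a hypothetical value and seen is assumed to be a column of recent_changes:

```sql
-- Total count: no ORDER BY needed, the result is one row either way.
SELECT count(*) AS cnt
FROM recent_changes rc
JOIN `case` c ON c.id = rc.case_id
WHERE (
        (rc.user_id != c.user_id AND c.user_id = '25')
        OR (rc.type = 'conduct' AND rc.ref_id = '25')
      )
  AND c.locked = 'N' AND rc.private != 'Y' AND rc.seen = 'false';

-- Paginated list: here the ORDER BY matters, and LIMIT keeps the page small.
-- The page size (20) and offset (0) are hypothetical values.
SELECT rc.*
FROM recent_changes rc
JOIN `case` c ON c.id = rc.case_id
WHERE (
        (rc.user_id != c.user_id AND c.user_id = '25')
        OR (rc.type = 'conduct' AND rc.ref_id = '25')
      )
  AND c.locked = 'N' AND rc.private != 'Y' AND rc.seen = 'false'
ORDER BY rc.datecreated DESC
LIMIT 20 OFFSET 0;
```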