Priority/fallback datetime without a full table scan

Time: 2017-12-08 05:40:29

Tags: mysql query-optimization

SELECT IF(priority_date, priority_date, created_at) as created_at
FROM table
WHERE IF(priority_date , priority_date , created_at) 
    BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59';

What is the best way to run this query, performance-wise?

I have a fairly large table with two datetime columns: created_at and priority_date.

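priority_date is not always present, but when it is, it should be the value the query filters on; otherwise the query falls back to created_at. created_at is always generated when the row is created. The query above results in an (almost) full table scan.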

Explain plan for the initial query:

+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
| id   | select_type | table           | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
|    1 | SIMPLE      | table           | ALL  | NULL          | NULL | NULL    | NULL | 444877 | Using where |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+

I should also note that priority_date and created_at will not necessarily both fall within the time range for a single row. So doing something like:

WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'

could produce incorrect results if priority_date is 2017-10-04 23:10:43 and created_at is 2017-10-10 01:23:45.
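To make that failure case concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so the example is runnable. The table name `events` and the sample rows are hypothetical, and COALESCE is used in place of IF(priority_date, ...), which should behave the same as long as priority_date is either NULL or a non-zero datetime.

```python
import sqlite3

# Hypothetical table and rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY, priority_date TEXT, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO events (id, priority_date, created_at) VALUES (?, ?, ?)",
    [
        (1, "2017-10-04 23:10:43", "2017-10-10 01:23:45"),  # priority_date out of range
        (2, None, "2017-10-10 08:00:00"),                   # falls back to created_at
        (3, "2017-10-10 12:00:00", "2017-10-01 09:00:00"),  # priority_date in range
    ],
)

START, END = "2017-10-10 00:00:00", "2017-10-10 23:59:59"

# Intended semantics: filter on priority_date when present, else on created_at.
fallback_ids = [row[0] for row in conn.execute(
    "SELECT id FROM events"
    " WHERE COALESCE(priority_date, created_at) BETWEEN ? AND ?"
    " ORDER BY id",
    (START, END),
)]

# Naive OR filter: a row matches if either column is in range.
naive_or_ids = [row[0] for row in conn.execute(
    "SELECT id FROM events"
    " WHERE priority_date BETWEEN ? AND ? OR created_at BETWEEN ? AND ?"
    " ORDER BY id",
    (START, END, START, END),
)]

print("fallback:", fallback_ids)
print("naive OR:", naive_or_ids)
```

Row 1 is the problem case: its effective date is 2017-10-04, yet the OR filter returns it because created_at happens to fall on 2017-10-10.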

Current number of rows in my table: 582739

Count for WHERE priority_date BETWEEN ...: 3908

Count for WHERE created_at BETWEEN ...: 3437

Example explain plan when a single column is queried with WHERE ... BETWEEN:

+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
| id   | select_type | table           | type  | possible_keys                    | key                              | key_len | ref  | rows | Extra                 |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
|    1 | SIMPLE      | table           | range | table_created_at_index           | table_created_at_index           | 5       | NULL | 3436 | Using index condition |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+

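Clearly the IF is not the most efficient approach. The columns are indexed, and the row estimates in the single-column explain plans match the counts above. How can I run a priority/fallback query without a severe performance penalty?

Edit

The best I have been able to come up with (but wow, it feels verbose and copy/paste-y):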

SELECT IF(priority_date, priority_date, created_at) as created_at, priority_date
FROM table 
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
    OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
HAVING ((priority_date AND priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59')
    OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59');

And its explain plan:

+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
| id   | select_type | table           | type        | possible_keys                                                         | key                                                                   | key_len | ref  | rows | Extra                                                                                                |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
|    1 | SIMPLE      | table           | index_merge | table_priority_date_index,table_created_at_index                      | table_priority_date_index,table_created_at_index                      | 6,5     | NULL | 7343 | Using sort_union(table_priority_date_index,table_created_at_index); Using where                      |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+

3 answers:

Answer 0: (score: 2)

SELECT priority_date as created_at
FROM table
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'

UNION ALL

SELECT created_at
FROM table
WHERE created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
 AND priority_date IS NULL;

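For the first half of this query you need an index that starts with priority_date, and for the second half you need an index on (created_at, priority_date).

The first half naturally does not match any row where priority_date is NULL.

The second half applies the range condition on created_at and then, within the subset of matching rows, further tests whether priority_date is NULL. This can be done through index condition pushdown.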
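As a sanity check on the reasoning above, the following sketch compares the UNION ALL form against a single-pass COALESCE filter. It again uses Python's sqlite3 module as a stand-in for MySQL; the table name `events`, the index names, and the rows are all hypothetical, and the sketch only checks that the two forms return the same values, not how MySQL would plan them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (priority_date TEXT, created_at TEXT NOT NULL)")
# Indexes mirroring the two suggested above (names are hypothetical).
conn.execute("CREATE INDEX idx_priority ON events (priority_date)")
conn.execute("CREATE INDEX idx_created_priority ON events (created_at, priority_date)")
conn.executemany(
    "INSERT INTO events (priority_date, created_at) VALUES (?, ?)",
    [
        ("2017-10-04 23:10:43", "2017-10-10 01:23:45"),  # excluded: priority_date wins
        (None, "2017-10-10 08:00:00"),                   # matched by the second half
        ("2017-10-10 12:00:00", "2017-10-01 09:00:00"),  # matched by the first half
        (None, "2017-10-11 00:00:00"),                   # excluded: outside the range
        ("2017-10-10 23:59:59", "2017-10-10 23:00:00"),  # matched by the first half only
    ],
)

params = {"s": "2017-10-10 00:00:00", "e": "2017-10-10 23:59:59"}

# Reference result: the original single-pass fallback filter.
single_pass = sorted(row[0] for row in conn.execute(
    "SELECT COALESCE(priority_date, created_at) FROM events"
    " WHERE COALESCE(priority_date, created_at) BETWEEN :s AND :e",
    params,
))

# The UNION ALL form: each half filters on one bare column.
union_all = sorted(row[0] for row in conn.execute(
    """
    SELECT priority_date AS created_at FROM events
     WHERE priority_date BETWEEN :s AND :e
    UNION ALL
    SELECT created_at FROM events
     WHERE created_at BETWEEN :s AND :e
       AND priority_date IS NULL
    """,
    params,
))

print(single_pass)
print(union_all)
```

The last sample row has both columns in range but appears only once, because the `priority_date IS NULL` condition keeps the second half from matching it again.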

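Answer 1: (score: 2)

First you need a compound index on (priority_date, created_at); then you can use a query like this:

{{1}}

Putting priority_date first in the compound index makes a big difference. No union is needed.

Explain results for 400k rows with 400 results:

Extra: Using where; Using index
key: priority_created_compound
rows: 2000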

Answer 2: (score: 0)

( SELECT  priority_date AS created_at
    FROM  table
    WHERE  priority_date >= '2017-10-10'
      AND  priority_date <  '2017-10-10' + INTERVAL 1 DAY )
UNION  DISTINCT 
( SELECT  created_at
    FROM  table
    WHERE  created_at >= '2017-10-10'
      AND  created_at <  '2017-10-10' + INTERVAL 1 DAY
      AND  priority_date IS NULL )

With

INDEX(priority_date, created_at)  -- in this order

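Notes:

  • This alternative to BETWEEN works better for date ranges on types other than DATETIME, and it avoids having to compute leap days and the like. (It makes no performance difference.)
  • For each subquery, the one index is "covering" and optimal. ICP is not needed.
  • I chose DISTINCT for the UNION. It is slower than ALL, but it may suit your application better. If duplicates cannot occur, or if duplicates are acceptable, switch to ALL.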
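Two of the notes above can be illustrated with a small sketch, again using Python's sqlite3 module as a stand-in for MySQL. The table name `events` and the rows are hypothetical. SQLite compares these datetimes as text and spells the next-day bound as `date(..., '+1 day')` rather than `+ INTERVAL 1 DAY`; the fractional-seconds row assumes a column type that stores them, such as DATETIME(3) in MySQL. SQLite also writes UNION DISTINCT as plain UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (priority_date TEXT, created_at TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO events (priority_date, created_at) VALUES (?, ?)",
    [
        ("2017-10-10 23:59:59.500", "2017-10-09 10:00:00"),  # inside the last second of the day
        ("2017-10-10 12:00:00", "2017-10-09 11:00:00"),
        (None, "2017-10-10 12:00:00"),                       # same effective value as the row above
    ],
)

# BETWEEN with a 23:59:59 upper bound misses the fractional-second row.
between_count = conn.execute(
    "SELECT COUNT(*) FROM events"
    " WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'"
).fetchone()[0]

# The half-open range [day, day + 1) includes it.
half_open_count = conn.execute(
    "SELECT COUNT(*) FROM events"
    " WHERE priority_date >= '2017-10-10'"
    "   AND priority_date < date('2017-10-10', '+1 day')"
).fetchone()[0]

# Same two-part query, with only the set operator swapped.
UNION_SQL = """
SELECT priority_date AS created_at FROM events
 WHERE priority_date >= '2017-10-10'
   AND priority_date < date('2017-10-10', '+1 day')
{op}
SELECT created_at FROM events
 WHERE created_at >= '2017-10-10'
   AND created_at < date('2017-10-10', '+1 day')
   AND priority_date IS NULL
"""
distinct_rows = conn.execute(UNION_SQL.format(op="UNION")).fetchall()
all_rows = conn.execute(UNION_SQL.format(op="UNION ALL")).fetchall()

print(between_count, half_open_count)
print(len(distinct_rows), len(all_rows))
```

With this data the deduplicating UNION collapses the two rows that share the effective value 2017-10-10 12:00:00, while UNION ALL keeps both. Which behavior is correct depends on whether the application is counting rows or listing distinct timestamps.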