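I have an InnoDB table with 11 columns and about 5 million records, and I run a query against it to find the top 10 records with the highest sum. The table schema is as follows.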
id (int 11) (primary key)
activity_id(varchar 250)
activity_type (varchar 10)
advertised_time (timestamp)
advertised_train_ident(int 11)
technical_train_ident(int 11)
location_signature(varchar 10)
time_at_location(timestamp)
information_owner(varchar 100)
created_at(timestamp)
updated_at(timestamp)
The indexes on the table are:
id - primary key
location_signature,activity_type, advertised_time - composite index (name is search)
I am using the following query to extract records from the table above, and it takes 10 to 12 seconds to execute.
SELECT location_signature, activity_type,
SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0 THEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) else 0 END) as delay_time,
count(id) as total_train_count,
SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0 THEN 1 ELSE 0 END) as delayed_train_count
from `train_announcements`
where `advertised_time` >= '2019-04-01 10:00:00' and `advertised_time` <= '2019-04-30 10:00:00'
group by `location_signature`, `activity_type`
order by `delay_time` desc
limit 10 offset 0;
The EXPLAIN output for this query is as follows:
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | train_announcements | index | search | search | 84 | NULL | 4910024| Using where; Using temporary; Using filesort |
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
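Please note that the table's collation is utf8mb4_unicode_ci because the location_signature field contains special characters.
It would be great if anyone could suggest a way to improve the performance of this query. Thanks in advance.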
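Answer 0 (score: 3)
Looking at your index, make sure advertised_time is the leftmost column, since it is the column the WHERE clause filters on.
It may also be useful to add time_at_location, so that the query can use the data in the index and avoid accessing the table rows.
Index on table train_announcements
columns (advertised_time, location_signature, activity_type, time_at_location)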
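The index described above could be created roughly as follows. This is a minimal sketch: the index name idx_advertised_cover is made up for illustration, and whether the optimizer actually picks the index depends on your data and MySQL version, so it is worth checking with EXPLAIN afterwards.

```sql
-- Hypothetical index name; the column order follows the answer above,
-- with advertised_time leftmost so the range filter can use the index.
ALTER TABLE train_announcements
  ADD INDEX idx_advertised_cover
    (advertised_time, location_signature, activity_type, time_at_location);

-- Re-check the plan. If the index covers the query, the Extra column
-- should include "Using index" (not verified against the asker's data).
EXPLAIN
SELECT location_signature, activity_type, COUNT(*) AS total_train_count
FROM train_announcements
WHERE advertised_time BETWEEN '2019-04-01 10:00:00' AND '2019-04-30 10:00:00'
GROUP BY location_signature, activity_type;
```

The query itself stays the same: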
SELECT location_signature
, activity_type
, SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0
THEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location)
ELSE 0 END) as delay_time
, count(id) as total_train_count
, SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0
THEN 1
ELSE 0 END) as delayed_train_count
from `train_announcements`
where `advertised_time` BETWEEN '2019-04-01 10:00:00' and '2019-04-30 10:00:00'
group by `location_signature`, `activity_type`
order by `delay_time` desc
limit 10 offset 0;
And if id never contains NULL values (it is the primary key here, so it cannot), try using count(*) instead of count(id):
SELECT location_signature
, activity_type
, SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0
THEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location)
ELSE 0 END) as delay_time
, count(*) as total_train_count
, SUM(CASE WHEN TIMESTAMPDIFF(MINUTE,advertised_time, time_at_location) > 0
THEN 1
ELSE 0 END) as delayed_train_count
from `train_announcements`
where `advertised_time` BETWEEN '2019-04-01 10:00:00' and '2019-04-30 10:00:00'
group by `location_signature`, `activity_type`
order by `delay_time` desc
limit 10 offset 0;
Or, if you really do need id, try adding that column to the composite index as well:
(advertised_time, location_signature, activity_type, time_at_location, id )
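Answer 1 (score: 0)
Build and maintain a summary table, for example with daily subtotals. The "report" then runs against this much smaller table and is therefore faster.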
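A summary table along those lines might look like the sketch below. The table name train_delay_daily and its column names are assumptions made for illustration, as is the choice of NOT NULL columns (it assumes location_signature and activity_type are never NULL in the source table). Daily buckets can only answer ranges aligned to whole days, so the 10:00:00 boundaries in the question would need hourly buckets or a separate pass over the base table for the partial days.

```sql
-- Hypothetical summary table: one row per day, location and activity type.
CREATE TABLE train_delay_daily (
  summary_date       DATE         NOT NULL,
  location_signature VARCHAR(10)  NOT NULL,
  activity_type      VARCHAR(10)  NOT NULL,
  delay_minutes      BIGINT       NOT NULL,
  train_count        INT UNSIGNED NOT NULL,
  delayed_count      INT UNSIGNED NOT NULL,
  PRIMARY KEY (summary_date, location_signature, activity_type)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

-- Fill (or refill) one month of subtotals, using the same
-- expressions as the original query.
INSERT INTO train_delay_daily
  (summary_date, location_signature, activity_type,
   delay_minutes, train_count, delayed_count)
SELECT DATE(advertised_time), location_signature, activity_type,
       SUM(CASE WHEN TIMESTAMPDIFF(MINUTE, advertised_time, time_at_location) > 0
                THEN TIMESTAMPDIFF(MINUTE, advertised_time, time_at_location)
                ELSE 0 END),
       COUNT(*),
       SUM(CASE WHEN TIMESTAMPDIFF(MINUTE, advertised_time, time_at_location) > 0
                THEN 1 ELSE 0 END)
FROM train_announcements
WHERE advertised_time >= '2019-04-01 00:00:00'
  AND advertised_time <  '2019-05-01 00:00:00'
GROUP BY DATE(advertised_time), location_signature, activity_type;

-- The report then sums the daily subtotals instead of scanning raw rows.
SELECT location_signature, activity_type,
       SUM(delay_minutes) AS delay_time,
       SUM(train_count)   AS total_train_count,
       SUM(delayed_count) AS delayed_train_count
FROM train_delay_daily
WHERE summary_date BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY location_signature, activity_type
ORDER BY delay_time DESC
LIMIT 10;
```

How the table is kept up to date (a nightly job, or an incremental insert as new rows arrive) is left open here; the answer does not specify it.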