I have a table with 900k+ records.
Running this query takes a minute or longer:
SELECT
t.user_id,
SUM(t.direction = "i") AS 'num_in',
SUM(t.direction = "o") AS 'num_out'
FROM tbl_user_reports t
WHERE t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
GROUP BY t.user_id
HAVING t.user_id IS NOT NULL
ORDER BY num_in DESC
LIMIT 10;
Can you tell me how to get the results faster?
More information, table structure:
id int(11) unsigned NOT NULL
subscriber varchar(255) NULL
user_id int(11) unsigned NULL
carrier_id int(11) unsigned NOT NULL
pool_id int(11) unsigned NOT NULL
service_id int(11) unsigned NOT NULL
persona_id int(11) unsigned NULL
inbound_id int(11) unsigned NULL
outbound_id int(11) unsigned NULL
bound_time datetime NOT NULL
direction varchar(1) NOT NULL
Indexes (key name, then indexed column):
bound_time (bound_time)
FK_tbl_user_reports (persona_id)
FK_tbl_user_reports_message (inbound_id)
FK_tbl_user_reports_service (service_id)
FK_tbl_user_reports_pool (pool_id)
FK_tbl_user_reports_user (user_id)
FK_tbl_user_reports_carrier (carrier_id)
FK_tbl_user_reports_subscriber (subscriber)
FK_tbl_user_reports_outbound (outbound_id)
direction (direction)
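Answer 0: (score: 2)
You may want to try a composite index on (bound_time, user_id, direction).
It contains every column the query needs, and it can narrow down the date range very efficiently.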
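A minimal sketch of that suggestion, assuming MySQL and the table from the question. The index name idx_bound_user_dir is made up for illustration. Prefixing the original query with EXPLAIN shows whether the optimizer picks the new index. When an index covers every column a query reads, the Extra column normally reports "Using index", although the optimizer may still choose a different plan depending on the data.

```sql
-- idx_bound_user_dir is a hypothetical name; pick whatever fits your naming scheme.
ALTER TABLE tbl_user_reports
  ADD INDEX idx_bound_user_dir (bound_time, user_id, direction);

-- Same query as in the question, prefixed with EXPLAIN to inspect the plan.
EXPLAIN
SELECT
  t.user_id,
  SUM(t.direction = 'i') AS num_in,
  SUM(t.direction = 'o') AS num_out
FROM tbl_user_reports t
WHERE t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
GROUP BY t.user_id
HAVING t.user_id IS NOT NULL
ORDER BY num_in DESC
LIMIT 10;
```

The grouping and sorting still need a temporary table, because the rows arrive in bound_time order rather than user_id order. The gain comes from reading only the index entries for the date range instead of full rows.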
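Answer 1: (score: 2)
If possible, redesign your report table so that it makes better use of the InnoDB clustered primary key index.
Here is a simplified example of what I mean:
5 million rows, 32K users, 126K records in the date range.
Cold runtime (after a mysqld restart) = 0.13 seconds.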
create table user_reports
(
bound_time datetime not null,
user_id int unsigned not null,
id int unsigned not null,
direction tinyint unsigned not null default 0,
primary key (bound_time, user_id, id) -- clustered composite PK
)
engine=innodb;
select count(*) as counter from user_reports;
+---------+
| counter |
+---------+
| 5000000 |
+---------+
select count(distinct(user_id)) as counter from user_reports;
+---------+
| counter |
+---------+
| 32000 |
+---------+
select count(*) as counter from user_reports
where bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00';
+---------+
| counter |
+---------+
| 126721 |
+---------+
select
t.user_id,
sum(t.direction = 1) AS num_in,
sum(t.direction = 0) AS num_out
from
user_reports t
where
t.bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00' and
t.user_id is not null
group by
t.user_id
order by
direction desc
limit 10;
+---------+--------+---------+
| user_id | num_in | num_out |
+---------+--------+---------+
| 17397 | 1 | 1 |
| 14729 | 2 | 1 |
| 20094 | 4 | 1 |
| 19343 | 7 | 1 |
| 24804 | 1 | 2 |
| 14714 | 3 | 2 |
| 2662 | 4 | 3 |
| 16360 | 2 | 3 |
| 21288 | 2 | 3 |
| 12800 | 6 | 2 |
+---------+--------+---------+
10 rows in set (0.13 sec)
explain
select
t.user_id,
sum(t.direction = 1) AS num_in,
sum(t.direction = 0) AS num_out
from
user_reports t
where
t.bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00' and
t.user_id is not null
group by
t.user_id
order by
direction desc
limit 10;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref |rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 8 | NULL |255270 | Using where; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
1 row in set (0.00 sec)
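To try this layout with the data from the question, the existing rows would have to be copied over. The following is only a sketch under a few assumptions: the source table is tbl_user_reports as posted in the question, 'i' maps to 1 and every other direction value maps to 0 (matching the num_in / num_out convention above), and rows with a NULL user_id are skipped because the new primary key does not allow NULL.

```sql
-- Assumption: direction 'i' becomes 1, anything else becomes 0.
-- Rows with NULL user_id are left out because user_id is part of the primary key.
INSERT INTO user_reports (bound_time, user_id, id, direction)
SELECT
  r.bound_time,
  r.user_id,
  r.id,
  (r.direction = 'i')
FROM tbl_user_reports r
WHERE r.user_id IS NOT NULL;
```

On a large table it is probably worth loading in primary key order (ORDER BY bound_time, user_id, id), since InnoDB stores rows in clustered key order.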
Hope you find this useful :)
Answer 2: (score: 1)
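As Thilo said, add the index. Instead of tbl_user_reports t
I would write tbl_user_reports AS t,
and I would move the HAVING condition into the WHERE clause, so that rows with a NULL user_id are discarded before grouping instead of after it: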
WHERE t.user_id IS NOT NULL AND t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
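For reference, here is the query from the question with that change applied. It is a sketch rather than a tested drop-in replacement. Apart from moving the filter, it uses single quotes for the string literals and leaves the column aliases unquoted.

```sql
SELECT
  t.user_id,
  SUM(t.direction = 'i') AS num_in,
  SUM(t.direction = 'o') AS num_out
FROM tbl_user_reports AS t
WHERE t.user_id IS NOT NULL            -- filter rows before grouping
  AND t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
GROUP BY t.user_id
ORDER BY num_in DESC
LIMIT 10;
```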
Update: for experimentation, you could also try LIKE instead of BETWEEN:
t.bound_time LIKE '2011-02%'
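One caveat worth checking with EXPLAIN: as far as I know, LIKE on a DATETIME column makes MySQL compare the values as strings, which usually prevents a range scan on the bound_time index. Also note that BETWEEN '2011-02-01' AND '2011-02-28' stops at 2011-02-28 00:00:00, so rows from later on the 28th are left out. A half-open range is an alternative that covers the whole month and can still use the index. This is a sketch against the table from the question:

```sql
SELECT
  t.user_id,
  SUM(t.direction = 'i') AS num_in,
  SUM(t.direction = 'o') AS num_out
FROM tbl_user_reports AS t
WHERE t.user_id IS NOT NULL
  AND t.bound_time >= '2011-02-01'     -- start of February, inclusive
  AND t.bound_time <  '2011-03-01'     -- start of March, exclusive
GROUP BY t.user_id
ORDER BY num_in DESC
LIMIT 10;
```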