I have these two tables:
CREATE TABLE `cpuinfo` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`usagetime` datetime DEFAULT NULL,
`cpuusage` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`),
KEY `idx_usagetime` (`usagetime`),
KEY `idx_usage` (`cpuusage`));
CREATE TABLE `jobinfo` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`starttime` datetime NOT NULL,
`endtime` datetime DEFAULT NULL,
`jobname` text NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`),
KEY `idx-startime` (`starttime`),
KEY `idx-endtime` (`endtime`));
With this query:
explain SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
FROM (SELECT j.id, j.starttime, j.endtime, j.jobname, MAX(c.usagetime) AS usagetime
FROM jobinfo AS j
LEFT JOIN cpuinfo AS c ON c.usagetime <= j.starttime
GROUP BY j.id) AS j
JOIN cpuinfo AS c ON j.usagetime = c.usagetime
ORDER BY j.starttime
It takes about 10 minutes to run.
For the EXPLAIN command, I get this output:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
---------------------------------------------------------------------------
1,PRIMARY,<derived2>,ALL,NULL,NULL,NULL,NULL,4557,"Using filesort"
1,PRIMARY,c,ref,idx_usagetime,idx_usagetime,9,j.usagetime,1,"Using where"
2,DERIVED,j,ALL,NULL,NULL,NULL,NULL,4557,"Using temporary; Using filesort"
2,DERIVED,c,index,idx_usagetime,idx_usagetime,9,NULL,2880,"Using index"
Can you give me some tips for optimizing this SQL query?
This is my original post:
Answer 0 (score: 0)
You are joining on a less-than-or-equal comparison:
c.usagetime <= j.starttime
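This means that every cpuinfo record whose usagetime is at or before the job's starttime gets joined to that job record. The query will get slower and slower over time, because it joins samples from months ago if they exist. You are only interested in the most recent entry before the job started.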
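To see how large that intermediate result gets, one could count the pairs the inequality join produces before any grouping. This is only a rough diagnostic sketch against the tables from the question; the alias joined_rows is an illustrative name, and the count itself may be slow on large tables.

```sql
-- Count how many (job, cpu sample) pairs the inequality join generates.
-- Every sample taken at or before a job's start is paired with that job,
-- so the result grows with both the number of jobs and the sample history.
SELECT COUNT(*) AS joined_rows
FROM jobinfo AS j
JOIN cpuinfo AS c ON c.usagetime <= j.starttime;
```

Going by the row estimates in the question's EXPLAIN output (4557 jobs, 2880 cpuinfo rows), the upper bound is roughly 13 million pairs, all of which must be grouped just to find one MAX per job.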
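If you are confident that there is a cpuinfo record within some fixed period before each job's start time, change the condition to a range search:
c.usagetime BETWEEN DATE_SUB(j.starttime, INTERVAL 5 MINUTE) AND j.starttime
This should speed things up considerably. The smaller you can make the interval, the better.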
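Dropped into the question's query, a bounded range might look like the sketch below. Note that BETWEEN expects the lower bound first. The 5-minute width is an assumption taken from this answer; it needs to be at least as long as the longest gap between consecutive cpuinfo samples, otherwise some jobs will find no sample.

```sql
-- Sketch: the question's query with the join limited to a window
-- ending at each job's start time. The window width is an assumption.
SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
FROM (SELECT j.id, j.starttime, j.endtime, j.jobname,
             MAX(c.usagetime) AS usagetime
      FROM jobinfo AS j
      LEFT JOIN cpuinfo AS c
        ON c.usagetime BETWEEN DATE_SUB(j.starttime, INTERVAL 5 MINUTE)
                           AND j.starttime
      GROUP BY j.id) AS j
JOIN cpuinfo AS c ON j.usagetime = c.usagetime
ORDER BY j.starttime;
```

Jobs with no sample inside the window end up with a NULL usagetime in the derived table and are then dropped by the outer inner join, as they are in the original query when no earlier sample exists.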
Answer 1 (score: 0)
You can try this little trick:
SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
FROM
(
SELECT j.id, j.starttime, j.endtime, j.jobname, MAX(c.usagetime) AS usagetime
FROM jobinfo AS j
LEFT JOIN cpuinfo AS c
ON c.usagetime <= j.starttime
WHERE c.usagetime > DATE_ADD(j.starttime, INTERVAL -1 DAY)
GROUP BY j.id
) AS j
JOIN cpuinfo AS c
ON j.usagetime = c.usagetime
ORDER BY j.starttime;
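This should make the server read only a small part of the cpuinfo table, rather than all or half of it.
PS: Experiment with the interval value; maybe 5 minutes is enough.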
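One way to choose the interval is to measure the largest gap between consecutive samples in cpuinfo, since the window must be at least that wide for every job to find a sample. A minimal sketch, assuming the sampling history in the table is representative; the alias max_gap_seconds is an illustrative name.

```sql
-- For each sample, find the next later sample and measure the gap.
-- The last sample has no successor, so its gap is NULL and MAX skips it.
SELECT MAX(TIMESTAMPDIFF(SECOND,
                         c1.usagetime,
                         (SELECT MIN(c2.usagetime)
                          FROM cpuinfo AS c2
                          WHERE c2.usagetime > c1.usagetime))) AS max_gap_seconds
FROM cpuinfo AS c1;
```

An interval slightly larger than the reported gap would be a reasonable starting point.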
Answer 2 (score: 0)
Try:
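-- Table names are lowercase to match the CREATE TABLE statements;
-- MySQL table names can be case-sensitive, depending on the platform.
SELECT ji.starttime,
ji.endtime,
ji.jobname,
(SELECT ci.cpuusage
FROM cpuinfo ci
WHERE ci.usagetime <= ji.endtime
ORDER BY ci.usagetime DESC
LIMIT 1) AS cpuusage
FROM jobinfo ji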
This is my EXPLAIN output on MySQL 5.1.49:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
---------------------------------------------------------------------------
1,PRIMARY,ji,ALL,NULL,NULL,NULL,NULL,12,""
2,DEPENDENT SUBQUERY,ci,ALL,idx_usagetime,NULL,NULL,NULL,6,"Using where; Using filesort"
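The query in this answer compares against endtime, while the question looks up the sample at or before starttime. A variant adapted to the question's columns and sort order might look like the sketch below; this is an adaptation under that assumption, not the answerer's exact query.

```sql
-- Sketch: one correlated lookup per job for the latest cpuinfo sample
-- taken at or before the job's start time.
SELECT ji.id, ji.starttime, ji.endtime, ji.jobname,
       (SELECT ci.cpuusage
        FROM cpuinfo AS ci
        WHERE ci.usagetime <= ji.starttime
        ORDER BY ci.usagetime DESC
        LIMIT 1) AS cpuusage
FROM jobinfo AS ji
ORDER BY ji.starttime;
```

Two behaviors differ from the question's query: a job with no earlier sample is returned with a NULL cpuusage instead of being dropped, and each job yields exactly one row even if several cpuinfo rows share the same usagetime.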