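I have the following somewhat complex SQL query, which has terrible performance, "of course" because of the inner query in the WHERE clause. In some cases it takes up to a minute. Does anyone know how to rewrite this query to get better performance?
The query: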
SELECT DISTINCT t.id as taskId, t.name as taskName,
t.startdate as taskStartDate, t.enddate as taskEndDate,
t.proj_id as taskProjectId
FROM PROJECT p, EMPL_PROJ ep, TASK t, TIMERECORD tr
WHERE
ep.empl_id = ? AND
ep.proj_id = p.id AND
ep.proj_id = t.proj_id AND
((p.startdate IS NULL AND p.enddate IS NULL) OR
(p.startdate IS NULL AND p.enddate >= ?) OR
(p.enddate IS NULL AND p.startdate <= ? + INTERVAL 6 DAY) OR
(p.startdate <= ? + INTERVAL 6 DAY AND p.enddate >= ?) ) AND
((t.startdate IS NULL AND t.enddate IS NULL) OR
(t.startdate IS NULL AND t.enddate >= ?) OR
(t.enddate IS NULL AND t.startdate <= ? + INTERVAL 6 DAY) OR
(t.startdate <= ? + INTERVAL 6 DAY AND t.enddate >= ?)) AND
(
(ep.empl_id = tr.empl_id AND
ep.proj_id = tr.proj_id AND
t.id = tr.task_id AND tr.day <= ? + INTERVAL 7 DAY AND
tr.day >= ? + INTERVAL -14 DAY
) OR
(
(SELECT count(*)
FROM TIMERECORD tr2
WHERE
tr2.empl_id=ep.empl_id AND
tr2.proj_id=p.id AND tr2.day <= ? + INTERVAL 7 DAY AND
tr2.day >= ? + INTERVAL -14 DAY) <= 0
)
)
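I am using MySQL Server 5.1.40.
Edit (2): With the help of the comments and answers, I arrived at this query, which executes in about a second (down from almost a minute!):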
SELECT DISTINCT t.id as taskId, t.name as taskName,
t.startdate as taskStartDate, t.enddate as taskEndDate,
t.proj_id as taskProjectId
FROM (PROJECT p INNER JOIN EMPL_PROJ ep ON ep.proj_id = p.id)
INNER JOIN TASK t ON p.id=t.proj_id
INNER JOIN TIMERECORD tr ON tr.empl_id=ep.empl_id AND tr.proj_id=ep.proj_id
AND tr.task_id=t.id
WHERE
ep.empl_id = ? AND
((p.startdate IS NULL AND p.enddate IS NULL) OR
(p.startdate IS NULL AND p.enddate >= ?) OR
(p.enddate IS NULL AND p.startdate <= ? + INTERVAL 6 DAY) OR
(p.startdate <= ? + INTERVAL 6 DAY AND p.enddate >= ?) ) AND
((t.startdate IS NULL AND t.enddate IS NULL) OR
(t.startdate IS NULL AND t.enddate >= ?) OR
(t.enddate IS NULL AND t.startdate <= ? + INTERVAL 6 DAY) OR
(t.startdate <= ? + INTERVAL 6 DAY AND t.enddate >= ?)) AND
(
(
tr.day <= ? + INTERVAL 7 DAY AND
tr.day >= ? + INTERVAL -14 DAY
) OR
(
NOT EXISTS(SELECT *
FROM TIMERECORD tr2 INNER JOIN EMPL_PROJ ON tr2.empl_id=EMPL_PROJ.empl_id
INNER JOIN PROJECT ON PROJECT.id=tr2.proj_id
WHERE
tr2.day BETWEEN ? + INTERVAL -14 DAY AND ? + INTERVAL 7 DAY)
)
)
ORDER BY p.id, t.id
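The biggest contributions were the answer suggesting the NOT EXISTS approach (which I marked as correct) and the comments about not mixing explicit and implicit JOINs.
Thanks everyone!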
Answer 0 (score: 2)
You are using COUNT(*) when it seems you only need a NOT EXISTS. The part in question is:
(
(SELECT count(*)
FROM TIMERECORD tr2
WHERE
tr2.empl_id=ep.empl_id AND
tr2.proj_id=p.id AND tr2.day <= ? + INTERVAL 7 DAY AND
tr2.day >= ? + INTERVAL -14 DAY) <= 0
)
Replace it with:
(
NOT EXISTS(SELECT *
FROM TIMERECORD tr2
WHERE
tr2.empl_id=ep.empl_id AND
tr2.proj_id=p.id AND tr2.day <= ? + INTERVAL 7 DAY AND
tr2.day >= ? + INTERVAL -14 DAY)
)
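Now, if a matching TIMERECORD does exist, that part of the WHERE clause short-circuits to FALSE (NOT TRUE) as soon as the first row is found, without having to count every TIMERECORD.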
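Answer 1 (score: 0)
Get rid of the subquery.
(1). Compute the subquery separately, with the columns empl_id, proj_id, count(*), for all empl_ids and proj_ids where the day falls within the required range. This is a simple GROUP BY query.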
select empl_id,proj_id,count(*) as ct from TIMERECORD
where day between (? + INTERVAL -14 DAY) and (? + INTERVAL 7 DAY)
group by empl_id,proj_id;
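Call this result set B.
(2). Compute the rest of the query as you are doing now. Call this result set A.
(3). Do a left outer join of A and B on the columns empl_id, proj_id, which are common to both.
Then, in the WHERE clause, you can check the value of the B.ct column. It will be NULL for every empl_id, proj_id combination that has no entry in the TIMERECORD table within the given time range.
Actually, you don't even need count(*), since you don't care about the actual count. But let me not overcomplicate it.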
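Steps (1) to (3) could be combined into a single statement roughly as follows. This is only a sketch, assuming MySQL and the table and column names from the question. The alias B and the column ct come from the steps above. The extra LEFT JOIN on TIMERECORD tr is my own assumption about how to keep the "this task has a record in the window" branch of the original OR condition. Note that the placeholders are bound in a different order than in the original query: the four date parameters come before the employee id.

```sql
-- Sketch only: MySQL syntax, table and column names taken from the question.
-- Every date placeholder (?) is the same reference date the question binds.
SELECT DISTINCT t.id AS taskId, t.name AS taskName,
       t.startdate AS taskStartDate, t.enddate AS taskEndDate,
       t.proj_id AS taskProjectId
FROM EMPL_PROJ ep
INNER JOIN PROJECT p ON p.id = ep.proj_id
INNER JOIN TASK t ON t.proj_id = ep.proj_id
-- Time records of this employee for this task inside the window (may be absent).
LEFT JOIN TIMERECORD tr
       ON tr.empl_id = ep.empl_id
      AND tr.proj_id = ep.proj_id
      AND tr.task_id = t.id
      AND tr.day BETWEEN ? + INTERVAL -14 DAY AND ? + INTERVAL 7 DAY
-- Result set B: one row per (empl_id, proj_id) that has records in the window.
LEFT JOIN (
    SELECT empl_id, proj_id, COUNT(*) AS ct
    FROM TIMERECORD
    WHERE day BETWEEN ? + INTERVAL -14 DAY AND ? + INTERVAL 7 DAY
    GROUP BY empl_id, proj_id
) B ON B.empl_id = ep.empl_id AND B.proj_id = ep.proj_id
WHERE ep.empl_id = ?
  -- The project and task date-range conditions from the question go here unchanged.
  AND (tr.task_id IS NOT NULL   -- the task has a record in the window
       OR B.ct IS NULL);        -- or the employee has none for this project
```

Because B is computed once instead of once per candidate row, the server no longer has to re-run a correlated subquery for every row. Whether this beats the NOT EXISTS version depends on the indexes and data, so it is worth comparing both with EXPLAIN.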