我有一个很棒的查询 - 但是 - 加载大约需要10分钟。哪个是疯了。而且我希望它比现在运行得更快。
我想知道是否有任何提示可以优化我的查询以使其运行更快?
select DISTINCT
c.PaperID,
cdd.CodesF,
c.PageCount,
prr.projectname,
u.firstname + ' ' + u.lastname as Name,
ett.EventName,
cast(c.AssignedDate as DATE) [AssignedDate],
cast(ev.EventCompletionDate as DATE) [CompletionDate],
ar.ResultDescription,
a.Editor
from tbl_Papers c
left outer join (select cd.PaperId, count(*) as CodesF
from tbl_PaperCodes cd group by cd.PaperId) cdd
on cdd.PaperId = c.PaperId
left outer join
(SELECT
wfce.PaperEventActionNum,
c.PaperId,
CONVERT(varchar,wfce.ActionDate,101) CompletionDate,
pr.ProjectName,
wfce.ActionUserId,
u.firstname+' '+u.lastname [Editor]
FROM
dbo.tbl_WFPaperEventActions wfce
INNER JOIN dbo.tbl_Papers c ON wfce.PaperId = c.PaperId
INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId
WHERE
wfce.EventId = 204
AND c.Papersource =0
GROUP BY
wfce.PaperEventActionNum,
c.PaperId,
CONVERT(varchar,wfce.ActionDate,101),
pr.ProjectName,
wfce.ActionUserId,
u.firstname+' '+u.lastname
)a ON a.PaperId=c.PaperId,
tbl_Providers p, tbl_Sites s,
tbl_Projects prr, tbl_WFPaperEvents ev,
tbl_Users u, tbl_WFPaperEventTypes ett,
tbl_WFPaperEventActions arr, tbl_WFPaperEventActionResults ar
where s.SiteId = p.SiteId
and p.ProviderId = c.ProviderId
and s.ProjectId = prr.ProjectId
and ev.PaperId = c.PaperId
and ev.EventCreateUserId = u.UserId
and ev.EventCompletionDate >= dateadd(day,datediff(day,1,GETDATE()),0)
and ev.EventCompletionDate < dateadd(day,datediff(day,0,GETDATE()),0)
and ev.EventStatusId = 3
and ev.EventId in (201, 203)
and c.Papersource =0--Offshore
and ev.EventId=ett.EventID
and arr.PaperId=c.PaperId
and arr.EventId=ev.EventId
and arr.EventId=ar.EventID
and arr.ActionResultId=ar.ResultID
and arr.ActionResultId in (1,2,3,4)
order by paperid, u.FirstName + ' ' + u.LastName
答案 0 :(得分:1)
您需要仔细查看此查询的每一部分并问自己,是否需要?
使用别名a。
获取子查询它连接了6个表,但是如果跟踪到最终的select子句,则只能从该别名提供[Editor]。所以你需要6个表来到编辑器吗?不,你其实不需要2 tbl_WFPaperEventActions
和tbl_Users
。此外,此子查询按6个项目分组,包括日期,但其中3个项目未在整个查询中的任何其他位置使用 - 那么为什么要在分组中包含这些项目?这允许我们删除3个连接表。
在剩余的3个分组项目中,可以替换另外1个以避免tbl_WFPaperEventActions
和tbl_Papers
之间的连接,因为连接条件是“wfce.PaperId = c.PaperId”,我们只需要是按wfce.PaperId
而不是c.PaperId
最后,我们对字段wfce.PaperEventActionNum
感兴趣,这是由子查询提供但未在较大的查询中使用?为什么提供该字段是不是没有使用?事实证明它应该用于完成连接。子查询别名需要加入PaperEventActionNum
和PaperId
的外部查询。顺便说一下,还需要整个子查询按下连接结构,以符合ANSI连接语法规则。
永远不要“混合”ANSI连接语法与加入“老式方式”
这确实是一场灾难。
下面我已“开始”对您的查询进行一些修改,但我无法完成它,因为我无法测试它的任何部分;我根本不知道你的数据模型。
就个人而言,我会从头开始重新启动此查询,开始精简并逐项添加以确保其保持精益。
SELECT DISTINCT /* distinct isn't a good solution here */
c.PaperID
, cdd.CodesF
, c.PageCount
, prr.projectname
, u.firstname + ' ' + u.lastname AS Name
, ett.EventName
, CAST(c.AssignedDate AS date) [AssignedDate]
, CAST(ev.EventCompletionDate AS date) [CompletionDate]
, ar.ResultDescription
, a.Editor
FROM tbl_Papers c
LEFT OUTER JOIN ( -- can this be an inner join instead?
SELECT
cd.PaperId
, COUNT(*) AS CodesF
FROM tbl_PaperCodes cd
GROUP BY
cd.PaperId
) cdd
ON cdd.PaperId = c.PaperId
INNER JOIN tbl_Providers p ON c.ProviderId = p.ProviderId
INNER JOIN tbl_Sites s ON p.SiteId = s.SiteId
INNER JOIN tbl_Projects prr ON s.ProjectId = prr.ProjectId
INNER JOIN tbl_WFPaperEvents ev ON c.PaperId = ev.PaperId
INNER JOIN tbl_Users u ON ev.EventCreateUserId = u.UserId
INNER JOIN tbl_WFPaperEventTypes ett ON ev.EventId = ett.EventID
INNER JOIN tbl_WFPaperEventActions arr ON c.PaperId = arr.PaperId
AND ev.EventId = arr.EventId
INNER JOIN tbl_WFPaperEventActionResults ar ON arr.EventId = ar.EventID
AND arr.ActionResultId = ar.ResultID
AND arr.ActionResultId IN (1, 2, 3, 4)
LEFT OUTER JOIN (
SELECT
wfce.PaperEventActionNum
, wfce.PaperId
--, c.PaperId
--, CONVERT(varchar, wfce.ActionDate, 101) CompletionDate -- cast to date here
--, pr.ProjectName
--, wfce.ActionUserId
, u.firstname + ' ' + u.lastname [Editor]
FROM dbo.tbl_WFPaperEventActions wfce
--INNER JOIN dbo.tbl_Papers c ON wfce.PaperId = c.PaperId
--INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
--INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
--INNER JOIN tbl_Projects pr ON s.ProjectId = pr.ProjectId
tbl_Users INNER JOIN u ON wfce.ActionUserId = u.UserId
WHERE wfce.EventId = 204
AND c.Papersource = 0
GROUP BY
wfce.PaperEventActionNum
, wfce.PaperId
--, c.PaperId
--, CONVERT(varchar, wfce.ActionDate, 101)
--, pr.ProjectName
--, wfce.ActionUserId
, u.firstname + ' ' + u.lastname
) a
ON c.PaperId = a.PaperId AND arr.PaperEventActionNum = a.PaperEventActionNum
WHERE ev.EventCompletionDate >= DATEADD(DAY, DATEDIFF(DAY, 1, GETDATE()), 0)
AND ev.EventCompletionDate < DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)
AND ev.EventStatusId = 3
AND ev.EventId IN (201, 203)
AND c.Papersource = 0--Offshore
ORDER BY
paperid, u.FirstName + ' ' + u.LastName
我真的很讨厌DISTINCT。这很讨厌。它不能解决问题,它只是隐藏它们;并减慢一切以隐藏。
使用与查询复杂性成反比的distinct:
答案 1 :(得分:0)
检查您拥有join
,where
和group by
子句的字段数量。每个非索引字段都会对性能产生负面影响。
GROUP BY中的计算字段可能很痛苦,也可能是DISTINCT(特别是如果它们没有编入索引)。例如。对u.ID
而不是u.firstname+' '+u.lastname
或pr.ProjectId
i / o pr.ProjectName
之类的内容进行分组可以加快速度(如果需要,可以根据其他条件对输出进行排序)。< / p>
您真的需要使用left join
吗?即你想在连接的另一边保持桌子,即使在另一边没有匹配的情况下吗?如果没有,请将其替换为inner join
。
这里有各种小改进,例如:
(假设Papersource和EventId是索引):
FROM
(SELECT * FROM dbo.tbl_WFPaperEventActions WHERE EventId = 204) wfce
INNER JOIN
(SELECT * FROM dbo.tbl_Papers WHERE Papersource = 0) c
ON wfce.PaperId = c.PaperId
INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId
而不是
FROM
dbo.tbl_WFPaperEventActions wfce
INNER JOIN dbo.tbl_Papers c ON wfce.PaperId = c.PaperId
INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId
WHERE
wfce.EventId = 204
AND c.Papersource =0
或(如果我理解了这个想法):
and ev.EventCompletionDate BETWEEN (
dateadd(day, -1, GETDATE()) and dateadd(ns, -1, GETDATE())
而不是:
and ev.EventCompletionDate >= dateadd(day,datediff(day,1,GETDATE()),0)
and ev.EventCompletionDate < dateadd(day,datediff(day,0,GETDATE()),0)
一般情况下:问问自己这个查询究竟想要实现什么,数据的哪些部分与之相关,有多少源表可以用它们的片段替换(这可以使JOIN更快地工作),并尝试在JOIN和WHERE子句的使用方面保持一致。