SELECT
s.ColID1
,s.ColIdentification2
,s.StatusColumn
,(SELECT
MAX(pd.DateColumn)
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'TextFiles')
AS maxDate
,(SELECT TOP 1
u.Title
FROM DocumentTable pd
LEFT OUTER JOIN [User] u
ON u.UserId = pd.UserId
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC)
AS Name1
,(SELECT TOP 1
pd.DocumentType
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC)
, (SELECT TOP 1
pd.TypeofFile
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1
pd.Region
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1
pd.Agency
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC)
FROM Service s (NOLOCK)
--left outer join DocumentTable pd1 (NOLOCK)
--on pd1.ColIdentification2 = s.ColIdentification2
WHERE s.IsPresent = 1
--AND pd1.ColIdentification2 = s.ColIdentification2
AND s.StatusColumn IN ('Val1', 'Val3')
AND NOT EXISTS (SELECT
pd.DocumentTableId
FROM DocumentTable pd
WHERE pd.IsPresent = 1
AND pd.ColIdentification2 = s.ColIdentification2
AND pd.TypeofFile IN ('DC1', 'DC2'))
AND NOT EXISTS (SELECT
utds.ID
FROM utds
WHERE utds.Service_x0020_ID1_Id = s.ColID1
AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1
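I am trying to optimize this SQL. Because of the many subqueries, it takes a long time: the query runs for more than 10 minutes, and I am trying to improve it. Is there any way to avoid the subqueries? I tried using a LEFT OUTER JOIN between the tables, but I don't think I got the correct data, because of duplicate ColID1 data in DocumentTable.
Answer 0 (score: 0)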
It is hard to tune a query without statistics and an execution plan; without them it comes down to trial and error.
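One way to collect such numbers in SQL Server is to switch on I/O and timing statistics around the query. This is only a minimal sketch: the SELECT in the middle is a placeholder standing in for the full query from the question.

```sql
-- Report logical reads per table and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder: run the full query from the question here instead.
SELECT s.ColID1
FROM Service s
WHERE s.IsPresent = 1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The reads and timings appear in the messages output, so each rewrite can be compared against the original rather than judged by feel.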
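I think you can improve it by converting the subqueries into joins, so try to eliminate the subqueries.
You can replace four of them with the following query: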
SELECT s.ColID1
, s.ColIdentification2
, s.StatusColumn
, pd.DocumentType, pd.TypeofFile, pd.Region, pd.Agency
from [Service] s
outer apply (select top 1 DocumentType, TypeofFile, Region, Agency
from DocumentTable
where IsPresent = 1 and TypeofFile = 'Text Files'
and ColIdentification2 = s.ColIdentification2
order by DateColumn desc) pd
If that helps, try the same approach for the remaining subqueries.
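As a rough illustration of what "the same approach" might look like for the whole query, here is an untested sketch that folds maxDate and the [User] lookup into the same OUTER APPLY and keeps the original filters. It rests on two assumptions that are not confirmed by the question: that maxDate should use 'Text Files' like the other columns (the question filters on 'TextFiles' there), and that UserId is unique in [User], so the LEFT JOIN cannot multiply rows.

```sql
SELECT s.ColID1
     , s.ColIdentification2
     , s.StatusColumn
     , pd.DateColumn AS maxDate   -- assumption: same 'Text Files' filter as the other columns
     , u.Title       AS Name1
     , pd.DocumentType
     , pd.TypeofFile
     , pd.Region
     , pd.Agency
FROM [Service] s
-- One lookup of the newest 'Text Files' document per Service row.
OUTER APPLY (SELECT TOP 1 d.DateColumn, d.UserId, d.DocumentType,
                    d.TypeofFile, d.Region, d.Agency
             FROM DocumentTable d
             WHERE d.IsPresent = 1
               AND d.TypeofFile = 'Text Files'
               AND d.ColIdentification2 = s.ColIdentification2
             ORDER BY d.DateColumn DESC) pd
-- Assumption: UserId is unique in [User].
LEFT JOIN [User] u
       ON u.UserId = pd.UserId
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
  AND NOT EXISTS (SELECT 1
                  FROM DocumentTable x
                  WHERE x.IsPresent = 1
                    AND x.ColIdentification2 = s.ColIdentification2
                    AND x.TypeofFile IN ('DC1', 'DC2'))
  AND NOT EXISTS (SELECT 1
                  FROM utds
                  WHERE utds.Service_x0020_ID1_Id = s.ColID1
                    AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1;
```

Compare its output with the original query on a small sample before trusting it, especially the maxDate column.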
Also make sure the ColIdentification2 column is indexed in both tables.
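A sketch of what such indexes could look like follows. The index names are hypothetical, and the key and INCLUDE column choices are assumptions based only on the columns the query touches; column types and existing indexes on the real tables may call for something different.

```sql
-- Hypothetical index names; adjust to your own naming convention.
-- Key columns follow the equality filters, then the date used for "latest row".
CREATE NONCLUSTERED INDEX IX_DocumentTable_ColIdentification2_TypeofFile_DateColumn
    ON DocumentTable (ColIdentification2, TypeofFile, DateColumn DESC)
    INCLUDE (IsPresent, UserId, DocumentType, Region, Agency);

-- Supports the correlation from Service to DocumentTable.
CREATE NONCLUSTERED INDEX IX_Service_ColIdentification2
    ON [Service] (ColIdentification2);
```

Check the execution plan afterwards to confirm the indexes are actually used; wide INCLUDE lists also cost extra storage and write time.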
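Answer 1 (score: 0)
Flicker makes a great point about making sure your common columns (such as ColIdentification2) are indexed. I would also verify that you have an index on DocumentTable.DateColumn.
Anyway...
Things are a bit busy in your query, so let's reformat it a little and take a "big picture" look at it: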
SELECT
s.ColID1
,s.ColIdentification2
,s.StatusColumn
,(SELECT TOP 1 u.Title FROM DocumentTable pd LEFT OUTER JOIN [User] u ON u.UserId = pd.UserId WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) AS Name1
,(SELECT MAX(pd.DateColumn) FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'TextFiles') AS maxDate
,(SELECT TOP 1 pd.DocumentType FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.TypeofFile FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.Region FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.Agency FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
FROM Service s (NOLOCK)
WHERE s.IsPresent = 1
AND s.StatusColumn IN ('Val1', 'Val3')
AND NOT EXISTS (SELECT utds.ID FROM utds WHERE utds.Service_x0020_ID1_Id = s.ColID1 AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1
So it looks like the following columns all ultimately come from the SAME row in DocumentTable pd:
pd.DateColumn
pd.DocumentType
pd.TypeofFile
pd.Region
pd.Agency
Note: for pd.DateColumn, your use of MAX(pd.DateColumn) gives the same result as
the sub-select style you're using for the other pd.* columns:
SELECT TOP 1 pd.DateColumn from ...BLAH BLAH BLAH... order by pd.DateColumn DESC
Also, your pd.DateColumn sub-select has a WHERE clause checking for 'TextFiles'
instead of the 'Text Files' that the other pd.* columns use. Should they all
be 'Text Files'? (Note the embedded space in 'Text Files' vs. 'TextFiles'.)
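Instead of running the same subquery logic for pd five times, let's push it into a left join and try to do it once...
This is completely untested code, by the way; I hope it works :-)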
SELECT
s.ColID1
, s.ColIdentification2
, s.StatusColumn
/* If we get a stable row for PD pulling u.Title from User becomes easier... */
, (select u.Title from [User] u where u.UserId = pd.UserId) as userTitle
, pd.DateColumn
, pd.DocumentType
, pd.TypeofFile
, pd.Region
, pd.Agency
FROM Service s (NOLOCK)
left join DocumentTable pd
on pd.IsPresent = 1
and pd.ColIdentification2 = s.ColIdentification2
and pd.TypeofFile = 'Text Files'
/* This next condition avoids having to do the ORDER BY pd.DateColumn DESC
* The idea is for sqlserver to consider all potential matching pd records
* but ignore any that aren't the largest date.
*/
and not exists( select 1 from DocumentTable pd2
where pd2.IsPresent = pd.IsPresent
and pd2.ColIdentification2 = pd.ColIdentification2
and pd2.TypeofFile = pd.TypeofFile
and pd2.DateColumn > pd.DateColumn)
/* may as well add the "no DC1 & DC2" clause here... */
and not exists (select 1 FROM DocumentTable pd3
where pd3.IsPresent = pd.IsPresent
and pd3.ColIdentification2 = pd.ColIdentification2
and pd3.TypeofFile in ( 'DC1', 'DC2')
and pd3.DateColumn > pd.DateColumn)
WHERE s.IsPresent = 1
AND s.StatusColumn IN ('Val1', 'Val3')
AND NOT EXISTS (
SELECT 1 FROM utds
WHERE utds.Service_x0020_ID1_Id = s.ColID1
AND utds.Type IN ('DC1', 'DC2') )
ORDER BY s.ColID1
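Some closing thoughts:
I like to indent complex WHERE clauses; it makes it easier for me to wrap my head around the logic.
To think about how the query behaves, start with what the main table 's' is doing: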
select * FROM Service s
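For each row we get from 's', we want to find (at most) one suitable 'pd' record.
Here "suitable" means matching on common columns, such as pd.ColIdentification2 = s.ColIdentification2, and so on.
The subtle part is: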
AND NOT EXISTS (SELECT 1 FROM DocumentTable PD2 ....WHERE PD2.DATECOLUMN > PD.DATECOLUMN).
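One speed advantage here is that we don't really care about the ORDER BY; we just want to make sure we have the latest row in pd (we use NOT EXISTS with pd2 to knock any older pd records out of the running).
The reason I think this is faster than ORDER BY is that the SQL Server engine doesn't need to do an index walk to handle the ORDER BY DATECOLUMN DESC on the TOP 1. (A smart optimizer might figure it out and just jump to the largest index entry for DATECOLUMN... but that's a big maybe, so I expect this approach to be faster overall.)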
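To check that expectation rather than take it on faith, the two forms can be run side by side for a single key and their execution plans compared. This is a sketch only: the @id variable, its int type, and the value 1 are assumptions for illustration, and which form wins will depend on the indexes and data involved.

```sql
DECLARE @id int = 1;  -- assumption: ColIdentification2 is an int; substitute a real value

-- Form 1: TOP 1 with ORDER BY, as in the original subqueries.
SELECT TOP 1 pd.DocumentTableId, pd.DateColumn
FROM DocumentTable pd
WHERE pd.IsPresent = 1
  AND pd.ColIdentification2 = @id
  AND pd.TypeofFile = 'Text Files'
ORDER BY pd.DateColumn DESC;

-- Form 2: NOT EXISTS, keeping only rows that have no newer sibling.
SELECT pd.DocumentTableId, pd.DateColumn
FROM DocumentTable pd
WHERE pd.IsPresent = 1
  AND pd.ColIdentification2 = @id
  AND pd.TypeofFile = 'Text Files'
  AND NOT EXISTS (SELECT 1
                  FROM DocumentTable pd2
                  WHERE pd2.IsPresent = 1
                    AND pd2.ColIdentification2 = pd.ColIdentification2
                    AND pd2.TypeofFile = pd.TypeofFile
                    AND pd2.DateColumn > pd.DateColumn);
```

One behavioral difference to keep in mind: if two documents share the same latest DateColumn, the TOP 1 form returns one of them, while the NOT EXISTS form returns both, which would duplicate the Service row in a join.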
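You'll notice a similar trick that blocks any PD records that have DC1 or DC2.
In the original query, I read that part (at the end, in the main WHERE clause) as: "Even if a given PD record is perfect in every way (an exact match for 's' and the latest PD record), if any PD/S match exists with 'DC1' or 'DC2' (regardless of date), then we want to exclude all PD/S records."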