Question

      SELECT
 s.ColID1
,s.ColIdentification2
,s.StatusColumn
,(SELECT
     MAX(pd.DateColumn)
   FROM DocumentTable pd
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'TextFiles')
 AS maxDate
,(SELECT TOP 1
     u.Title
   FROM DocumentTable pd
   LEFT OUTER JOIN [User] u
     ON u.UserId = pd.UserId
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'Text Files'
   ORDER BY pd.DateColumn DESC)
 AS Name1
 ,(SELECT TOP 1
     pd.DocumentType
   FROM DocumentTable pd
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'Text Files'
   ORDER BY pd.DateColumn DESC)
, (SELECT TOP 1
     pd.TypeofFile
   FROM DocumentTable pd
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'Text Files'
   ORDER BY pd.DateColumn DESC)
 ,(SELECT TOP 1
    pd.Region
    FROM DocumentTable pd
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'Text Files'
   ORDER BY pd.DateColumn DESC)
 ,(SELECT TOP 1
    pd.Agency 
    FROM DocumentTable pd
   WHERE pd.IsPresent = 1
   AND pd.ColIdentification2 = s.ColIdentification2
   AND pd.TypeofFile = 'Text Files'
   ORDER BY pd.DateColumn DESC)
FROM Service s (NOLOCK)
--left outer join DocumentTable pd1 (NOLOCK)
--on pd1.ColIdentification2 = s.ColIdentification2
WHERE s.IsPresent = 1
--AND pd1.ColIdentification2 = s.ColIdentification2
AND s.StatusColumn IN ('Val1', 'Val3')
AND NOT EXISTS (SELECT
   pd.DocumentTableId
 FROM DocumentTable pd
 WHERE pd.IsPresent = 1
 AND pd.ColIdentification2 = s.ColIdentification2
 AND pd.TypeofFile IN ('DC1', 'DC2'))
AND NOT EXISTS (SELECT
   utds.ID
 FROM  utds
 WHERE utds.Service_x0020_ID1_Id = s.ColID1
 AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1

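I am trying to optimize this SQL. Because of the many subqueries, it takes a long time: the query runs for more than 10 minutes and I am trying to improve it. Is there any way to avoid the subqueries? I tried using a LEFT OUTER JOIN between the tables, but I think that, because of duplicated data for ColID1 in DocumentTable, I am not getting the correct data.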

Answer 1

It is hard to tune a query without statistics and an execution plan; without them it comes down to trial and error.
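If you want to collect those numbers yourself, a minimal sketch (assuming you are running the query from SSMS or a similar client against SQL Server) is to switch on the I/O and timing statistics around each variant and compare the logical reads on DocumentTable:

```sql
-- Report logical reads and CPU/elapsed time for each statement
-- (shown in the Messages tab in SSMS).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the query under test here: first the original, then each rewrite.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Turning on "Include Actual Execution Plan" in SSMS for the same runs shows whether DocumentTable is being scanned once per subquery.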

I think you can improve it by converting the subqueries into joins, so try to eliminate the subqueries.

You can get rid of four of the subqueries with the following query:

SELECT s.ColID1
    , s.ColIdentification2
    , s.StatusColumn
    , pd.DocumentType, pd.TypeofFile, pd.Region, pd.Agency
from [Service] s 
    outer apply (select top 1 DocumentType, TypeofFile, Region, Agency
                from DocumentTable
                where IsPresent = 1 and TypeofFile = 'Text Files' 
                    and ColIdentification2 = s.ColIdentification2
                order by DateColumn desc) pd

If this helps, try the same approach for the remaining subqueries.
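Extended to the whole query, the same idea might look like the sketch below. It is untested and rests on a few assumptions: that 'TextFiles' in the maxDate subquery was a typo for 'Text Files' (so maxDate is simply the DateColumn of the latest row), and that UserId is unique in [User]. The NOT EXISTS check against utds from the question is not reproduced here and would be carried over unchanged.

```sql
SELECT s.ColID1
     , s.ColIdentification2
     , s.StatusColumn
     , pd.DateColumn AS maxDate   -- assumes 'TextFiles' was meant to be 'Text Files'
     , u.Title       AS Name1
     , pd.DocumentType
     , pd.TypeofFile
     , pd.Region
     , pd.Agency
FROM [Service] s
    -- One lookup per Service row: the latest matching 'Text Files' document.
    OUTER APPLY (SELECT TOP 1 d.DateColumn, d.UserId, d.DocumentType,
                              d.TypeofFile, d.Region, d.Agency
                 FROM DocumentTable d
                 WHERE d.IsPresent = 1
                   AND d.TypeofFile = 'Text Files'
                   AND d.ColIdentification2 = s.ColIdentification2
                 ORDER BY d.DateColumn DESC) pd
    -- Assumes UserId is unique in [User], so this join cannot multiply rows.
    LEFT JOIN [User] u
        ON u.UserId = pd.UserId
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
  AND NOT EXISTS (SELECT 1
                  FROM DocumentTable x
                  WHERE x.IsPresent = 1
                    AND x.ColIdentification2 = s.ColIdentification2
                    AND x.TypeofFile IN ('DC1', 'DC2'))
  -- The question's NOT EXISTS against utds goes here, unchanged.
ORDER BY s.ColID1;
```

Because the APPLY returns at most one row per Service row, it avoids the duplicate rows that a plain LEFT OUTER JOIN to DocumentTable produces.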

Also make sure the ColIdentification2 column is indexed in both tables.
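As a rough illustration only, indexes along these lines would support the lookups used in the question. The index names are made up, and whether the key order and included columns are right depends on your data and on what indexes already exist:

```sql
-- Hypothetical index names; adjust columns to the real schema.
-- Key columns cover the equality filters plus the date ordering;
-- INCLUDE carries the columns the subqueries read, so no extra lookup is needed.
CREATE NONCLUSTERED INDEX IX_DocumentTable_Latest
    ON DocumentTable (ColIdentification2, TypeofFile, IsPresent, DateColumn DESC)
    INCLUDE (DocumentType, Region, Agency, UserId);

CREATE NONCLUSTERED INDEX IX_Service_ColIdentification2
    ON [Service] (ColIdentification2);
```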

Answer 2

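Flicker makes a great point about ensuring that your common columns (such as ColIdentification2) are indexed. I would also verify that you have an index on DocumentTable.DateColumn.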

Anyway...

Things are a bit busy in your query, so let's reformat it a little and take a "big picture" look at it:

SELECT
 s.ColID1
,s.ColIdentification2
,s.StatusColumn
,(SELECT TOP 1 u.Title         FROM DocumentTable pd LEFT OUTER JOIN [User] u ON u.UserId = pd.UserId WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) AS Name1
,(SELECT MAX(pd.DateColumn)    FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'TextFiles') AS maxDate
,(SELECT TOP 1 pd.DocumentType FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.TypeofFile   FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.Region       FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
,(SELECT TOP 1 pd.Agency       FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC)
FROM Service s (NOLOCK)
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
AND NOT EXISTS (SELECT utds.ID FROM  utds WHERE utds.Service_x0020_ID1_Id = s.ColID1 AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1

So it looks like the following columns all ultimately come from the SAME row in DocumentTable pd:

pd.DateColumn
pd.DocumentType 
pd.TypeofFile   
pd.Region       
pd.Agency       

Note: for pd.DateColumn, your use of MAX(pd.DateColumn) gives the same result as
      the sub-select style you're using for the other pd.* columns:
      SELECT TOP 1 pd.DateColumn from ...BLAH BLAH BLAH... order by pd.DateColumn DESC
Also, your pd.DateColumn sub-select has a WHERE clause checking for 'TextFiles'
instead of the 'Text Files' that the other pd.* columns are using. Should they all
be 'Text Files'?  (Note the embedded space in 'Text Files' vs 'TextFiles'.)

Instead of running the same subquery logic for pd 5 times, let's push it into a left join and try to do it just once...

This is completely untested code btw, I hope it works :-)

SELECT
  s.ColID1
, s.ColIdentification2
, s.StatusColumn
/* If we get a stable row for PD pulling u.Title from User becomes easier... */
, (select u.Title from [User] u where u.UserId = pd.UserId) as userTitle
, pd.DateColumn
, pd.DocumentType
, pd.TypeofFile
, pd.Region
, pd.Agency
FROM Service s (NOLOCK)
left join DocumentTable pd
       on  pd.IsPresent = 1 
       and pd.ColIdentification2 = s.ColIdentification2
       and pd.TypeofFile = 'Text Files'
       /* This next condition avoids having to do the ORDER BY pd.DateColumn DESC.
        * The idea is for SQL Server to consider all potential matching pd records
        * but ignore any that aren't the largest date.
        */
       and not exists( select 1 from DocumentTable pd2
                       where pd2.IsPresent          = pd.IsPresent
                         and pd2.ColIdentification2 = pd.ColIdentification2
                         and pd2.TypeofFile         = pd.TypeofFile
                         and pd2.DateColumn         > pd.DateColumn)
       /* may as well add the "no DC1 & DC2" clause here (regardless of date)... */
       and not exists (select 1 FROM DocumentTable pd3
                       where pd3.IsPresent          = pd.IsPresent
                         and pd3.ColIdentification2 = pd.ColIdentification2
                         and pd3.TypeofFile         in ( 'DC1', 'DC2'))
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
  AND NOT EXISTS (
     SELECT 1 FROM  utds
     WHERE utds.Service_x0020_ID1_Id = s.ColID1
       AND utds.Type                 IN ('DC1', 'DC2') )
ORDER BY s.ColID1

Some closing thoughts:

I like to indent complex WHERE clauses; it makes it easier for me to wrap my head around the logic.

To think about how the query behaves, start with what the main table is doing:

select * FROM Service s

For each 's' row we get from that, we want to find (at most) one suitable 'pd' record.

Here "suitable" means that the common columns match, such as pd.ColIdentification2 = s.ColIdentification2, and so on.

The subtlety is in:

AND NOT EXISTS (SELECT 1 FROM DocumentTable PD2 ....WHERE PD2.DATECOLUMN > PD.DATECOLUMN).

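One speed advantage here is that we really don't care about the ORDER BY; we just want to make sure we have the newest row in pd (we use the NOT EXISTS with pd2 to kick any older pd records out of the running).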

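(The reason I think this is faster than the ORDER BY is that the SQL Server engine doesn't need to do an index walk to handle the "ORDER BY DATECOLUMN DESC" on the TOP 1. A clever optimizer might figure that out and just jump to the largest index entry for DATECOLUMN... but that is a big maybe, so I expect this approach to be faster overall.)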
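For anyone who wants to see the NOT EXISTS "latest row" pattern in isolation, here is a small self-contained sketch. The table variable and its rows are invented purely for illustration; only the column names are borrowed from the question.

```sql
-- Made-up sample data: two documents for id 1, one for id 2.
DECLARE @Doc TABLE (ColIdentification2 int, DateColumn date, Region varchar(10));

INSERT INTO @Doc (ColIdentification2, DateColumn, Region)
VALUES (1, '2020-01-01', 'North'),
       (1, '2020-06-01', 'South'),
       (2, '2019-03-15', 'East');

-- Keep a row only if no other row for the same id has a later date.
SELECT pd.ColIdentification2, pd.DateColumn, pd.Region
FROM @Doc pd
WHERE NOT EXISTS (SELECT 1
                  FROM @Doc pd2
                  WHERE pd2.ColIdentification2 = pd.ColIdentification2
                    AND pd2.DateColumn         > pd.DateColumn);
-- Returns (1, 2020-06-01, South) and (2, 2019-03-15, East).
```

One caveat with this pattern: if two rows for the same id share the same latest DateColumn, both survive the NOT EXISTS check, whereas TOP 1 would return only one of them. If ties are possible in your data, you would need a tie-breaker column in the comparison.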

You'll notice a similar trick to block any PD record that has a DC1 or DC2.

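In the original query, I read that part (at the end, in the main WHERE clause) as: "even if a given PD record is perfect in every way (an exact match to 's' and the latest PD record), if any PD/S match exists with 'DC1' or 'DC2' (regardless of date), then we want to omit all PD/S records."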
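One thing worth checking before relying on this: a filter in the ON clause of a LEFT JOIN and the same filter in the main WHERE clause do not behave identically. The sketch below uses invented table variables to show the difference; it is an illustration of the general behavior, not a statement about your data.

```sql
-- Made-up sample data: service 2 has both a 'Text Files' and a 'DC1' document.
DECLARE @Service TABLE (ColIdentification2 int);
DECLARE @Doc     TABLE (ColIdentification2 int, TypeofFile varchar(20));

INSERT INTO @Service VALUES (1), (2);
INSERT INTO @Doc     VALUES (1, 'Text Files'), (2, 'Text Files'), (2, 'DC1');

-- Filter in WHERE (as in the original query): service 2 is dropped entirely.
SELECT s.ColIdentification2, pd.TypeofFile
FROM @Service s
LEFT JOIN @Doc pd
       ON pd.ColIdentification2 = s.ColIdentification2
      AND pd.TypeofFile = 'Text Files'
WHERE NOT EXISTS (SELECT 1 FROM @Doc pd3
                  WHERE pd3.ColIdentification2 = s.ColIdentification2
                    AND pd3.TypeofFile IN ('DC1', 'DC2'));
-- Returns only (1, 'Text Files').

-- Filter in ON: service 2 is still returned, with NULL for the pd columns.
SELECT s.ColIdentification2, pd.TypeofFile
FROM @Service s
LEFT JOIN @Doc pd
       ON pd.ColIdentification2 = s.ColIdentification2
      AND pd.TypeofFile = 'Text Files'
      AND NOT EXISTS (SELECT 1 FROM @Doc pd3
                      WHERE pd3.ColIdentification2 = pd.ColIdentification2
                        AND pd3.TypeofFile IN ('DC1', 'DC2'));
-- Returns (1, 'Text Files') and (2, NULL).
```

If the service row itself should disappear when a DC1 or DC2 document exists, as in the original query, the check needs to stay in the WHERE clause.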
