Order of correlated subqueries in the projection affects the query plan

Asked: 2010-03-30 21:23:45

Tags: sql sql-server performance tsql subquery

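I've noticed something a little unexpected in how SQL Server (SQL Server 2008 in this case) handles correlated subqueries in a select statement. My assumption was that the query plan should not be affected by the mere order in which subqueries (or columns, for that matter) are written in the projection clause of the select statement. However, that does not appear to be the case.

Consider the following two queries, which are identical except for the order of the subqueries in the CTE: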

--query 1: subquery for Color is second
WITH vw AS
(
 SELECT p.[ID],
  (SELECT TOP(1) [FirstName] FROM [Preference] WHERE p.ID = ID AND [FirstName] IS NOT NULL ORDER BY [LastModified] DESC) [FirstName],
  (SELECT TOP(1) [Color] FROM [Preference] WHERE p.ID = ID AND [Color] IS NOT NULL ORDER BY [LastModified] DESC) [Color]
 FROM Person p
)
SELECT ID, Color, FirstName
FROM vw
WHERE Color = 'Gray';


--query 2: subquery for Color is first
WITH vw AS
(
 SELECT p.[ID],
  (SELECT TOP(1) [Color] FROM [Preference] WHERE p.ID = ID AND [Color] IS NOT NULL ORDER BY [LastModified] DESC) [Color],
  (SELECT TOP(1) [FirstName] FROM [Preference] WHERE p.ID = ID AND [FirstName] IS NOT NULL ORDER BY [LastModified] DESC) [FirstName]
 FROM Person p
)
SELECT ID, Color, FirstName
FROM vw
WHERE Color = 'Gray';

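If you look at the two query plans, you'll see that an outer join is used for each subquery and that the joins appear in the same order as the subqueries are written. A filter is applied to the result of the outer join for Color, to remove rows where the color is not 'Gray'. (It seems odd to me that SQL Server would use an outer join for the Color subquery, since the filter effectively puts a non-null constraint on its result, but OK.)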
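One way to reproduce the comparison is to run each query with I/O and timing statistics switched on and read the output alongside the actual execution plan. The sketch below is only that: the table definitions, column types, and sample rows are assumptions made up for illustration, not my real schema, and with so little data the cost difference will not be visible until the tables are loaded with a realistic number of rows.

```sql
-- Hypothetical minimal schema; column types are assumptions.
CREATE TABLE Person (ID int NOT NULL PRIMARY KEY);

CREATE TABLE Preference
(
    ID           int         NOT NULL,  -- matches Person.ID
    FirstName    varchar(50) NULL,
    Color        varchar(20) NULL,
    LastModified datetime    NOT NULL
);

-- Made-up sample rows.
INSERT INTO Person (ID) VALUES (1), (2);

INSERT INTO Preference (ID, FirstName, Color, LastModified) VALUES
    (1, 'Ann', NULL,   '20100101'),
    (1, NULL,  'Gray', '20100201'),
    (2, 'Bob', 'Blue', '20100301');

-- Report logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query 2 from above; swap the two subqueries to get query 1 and compare.
WITH vw AS
(
 SELECT p.[ID],
  (SELECT TOP(1) [Color] FROM [Preference] WHERE p.ID = ID AND [Color] IS NOT NULL ORDER BY [LastModified] DESC) [Color],
  (SELECT TOP(1) [FirstName] FROM [Preference] WHERE p.ID = ID AND [FirstName] IS NOT NULL ORDER BY [LastModified] DESC) [FirstName]
 FROM Person p
)
SELECT ID, Color, FirstName
FROM vw
WHERE Color = 'Gray';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The statistics appear on the Messages tab in Management Studio; the logical reads reported for Preference are the figure to compare between the two orderings.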

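The filter removes most of the rows. As a result, query 2 is significantly cheaper than query 1, because far fewer rows take part in the second join. Setting aside all the reasons not to construct a statement like this, is this expected behavior? Shouldn't SQL Server move the filter as early as possible in the query plan, regardless of the order in which the subqueries are written?

Edit: To clarify, there is a valid reason I'm exploring this scenario. I may need to create a view that involves similarly constructed subqueries, and it is now apparent that any filtering on the columns projected from that view will vary in performance purely because of the order of the columns!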

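2 Answers:

Answer 0: (score: 2)

Because the TOP operator is in play here, the query optimizer is fairly blind to statistics, so it looks for other clues about how best to proceed, such as instantiating the relevant part of the CTE first.

It is an outer join because the subquery evaluates to NULL when it returns no rows, and the system instantiates it first. If you used an aggregate instead of TOP, you would probably get a slightly different but more consistent plan.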
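As a rough sketch of what an aggregate-based version could look like, the query below replaces each TOP(1) ... ORDER BY with MAX over the rows that carry the latest LastModified. It reuses the Person and Preference tables from the question; the aliases pr and x are just illustrative names. It assumes ties on LastModified are not a concern (on a tie MAX returns the greatest value, whereas TOP(1) returns an arbitrary one), and I have not tested whether it actually produces a better plan.

```sql
-- Aggregate-based variant of the question's CTE (untested sketch).
WITH vw AS
(
 SELECT p.[ID],
  -- Color from the most recently modified row that has a non-null Color
  (SELECT MAX(pr.[Color])
   FROM [Preference] pr
   WHERE pr.ID = p.ID
     AND pr.[Color] IS NOT NULL
     AND pr.[LastModified] =
         (SELECT MAX(x.[LastModified])
          FROM [Preference] x
          WHERE x.ID = p.ID AND x.[Color] IS NOT NULL)) [Color],
  -- FirstName from the most recently modified row that has a non-null FirstName
  (SELECT MAX(pr.[FirstName])
   FROM [Preference] pr
   WHERE pr.ID = p.ID
     AND pr.[FirstName] IS NOT NULL
     AND pr.[LastModified] =
         (SELECT MAX(x.[LastModified])
          FROM [Preference] x
          WHERE x.ID = p.ID AND x.[FirstName] IS NOT NULL)) [FirstName]
 FROM Person p
)
SELECT ID, Color, FirstName
FROM vw
WHERE Color = 'Gray';
```

Like the TOP version, each subquery yields NULL when no qualifying row exists, so the result should match the original wherever LastModified values are unique per person.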

Answer 1: (score: 1)

Here is an alternate version that might perform better:

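With Colors As
    (
    -- Number each person's non-null Color rows, newest first
    Select Id, [Color]
        , ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY [LastModified] DESC ) As Num
    From Preference
    Where [Color] Is Not Null
    )
    , Names As
    (
    -- Number each person's non-null FirstName rows, newest first
    Select Id, [FirstName]
        , ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY [LastModified] DESC ) As Num
    From Preference
    Where [FirstName] Is Not Null
    )
Select P.Id, C.[Color], N.[FirstName]
From Person As P
    -- Inner join: only people whose latest color exists can match the filter
    Join Colors As C
        On C.Id = P.Id
            And C.Num = 1
    -- Left join: keep the person even when no FirstName has been recorded
    Left Join Names As N
        On N.Id = P.Id
            And N.Num = 1
Where C.[Color] = 'Gray'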

Another, more concise solution, which may or may not perform as well:

With RankedItems As
    (
    -- Rows with a non-null value sort first, then newest first,
    -- so rank 1 is the latest non-null value when one exists
    Select Id, [Color], [FirstName]
        , ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY Case When [Color] Is Not Null Then 1 Else 0 End DESC, [LastModified] DESC ) As ColorRank
        , ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY Case When [FirstName] Is Not Null Then 1 Else 0 End DESC, [LastModified] DESC ) As NameRank
    From Preference
    )
Select P.Id, RI.[Color], RI2.[FirstName]
From Person As P
    Join RankedItems As RI
        On RI.Id = P.Id
            And RI.ColorRank = 1
    Left Join RankedItems As RI2
        On RI2.Id = P.Id
            And RI2.NameRank = 1
Where RI.[Color] = 'Gray'