SQL Server(2014)来自IN(列表)变体的奇怪行为

时间:2017-11-07 22:01:21

标签: sql-server union-all

我一直在尝试优化(或至少更改)C#中的某些EF代码以使用存储过程,并在找到与常量列表匹配的行时发现似乎是异常(或者我的新东西) 。 典型的手动生成的短查询类似于......

SELECT Something FROM Table WHERE ID IN (one, two, others);

我们有一个EF查询,我们用存储过程调用替换,所以我查看输出,看到它很复杂,并认为我更简单的查询(类似于上面)会更好。事实并非如此。这是一个快速演示,可以重现这一点。

任何人都可以解释为什么最终版本的执行计划 - 使用

...WHERE EXISTS(... (SELECT 1 AS X) AS Alias UNION ALL...) AS Alias...)

构造更好 - 看起来因为它省略了昂贵的SORT操作,即使该计划包含两个索引扫描而不是一个更简单的查询。

这是一个自包含的示例脚本(我希望)......

USE SandBox;  -- a dummy database, not a live one!
-- create our dummy table, dropping first if it exists
IF EXISTS (SELECT NULL FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Test')
    DROP TABLE Test;
CREATE TABLE Test (Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY, FormId INT NOT NULL, DateRead DATE NULL);
-- populate with some data
INSERT INTO Test VALUES (1, NULL), (1, GETDATE()), (1, NULL), (4, NULL), (5, NULL), (6, GETDATE());

-- Simple query that I might typically use
-- how many un-read entries are there for a set of 'forms' of interest, 1, 5 and 6
-- (we're happy to omit forms with none)
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test AS T
 WHERE  T.FormId IN (1, 5, 6)
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- This is the first step towards the EF-generated code
-- using an EXISTS gives basically the same plan but with constants
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (VALUES (1), (5), (6)
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- A step closer, using UNION ALL instead of VALUES to generate the 'table'
-- still the same plan
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (    SELECT 1 
                                UNION ALL
                                SELECT 5 
                                UNION ALL
                                SELECT 6 
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- Now what the EF actually generated (cleaned up a bit)
-- Adding in the "FROM (SELECT 1 as X) AS alias" changes the execution plan considerably and apparently costs less to run
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (    SELECT 1 FROM (SELECT 1 AS X) AS X1
                                UNION ALL
                                SELECT 5 FROM (SELECT 1 AS X) AS X2
                                UNION ALL
                                SELECT 6 FROM (SELECT 1 AS X) AS X3
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

任何人都可以帮助我理解为什么以及这种查询格式是否有更广泛使用的好处?

我在(SELECT 1 AS X)内找到了一些特别的东西,虽然很多人认为它在EF输出中很常见,但我看不出有关这种特殊明显好处的任何内容。

提前致谢,

基思

1 个答案:

答案 0 :(得分:1)

最后一个查询中每个索引扫描背后的谓词是范围ID> = 1且id< = 6且DateRead IS NULL

range between 1 and 6

我确实添加了另一个"选择1"它确实创建了另一个索引扫描。看起来每个(选择1)在字面上都被视为一个表,尽管UNION ALL将它组成一个表。

在所有先前的查询中,索引扫描的谓词是一组DateRead IS NULL,后跟OR

1 or 5 or 6

以为我会把它添加到混合中:

declare @tmp table (formid int not null primary key)
insert into @tmp values (1),(5),(6);

SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM @tmp X
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

但那还包括一种。

dbfiddle.uk允许访问完整的showplan xml(有点繁琐)所以如果感兴趣: dbfiddle是here