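SQL Server filtered index does not work as expected when the predicate column is missing from the index

Time: 2016-08-29 07:25:01

Tags: sql-server indexing filtered-index

I am currently experimenting with filtered indexes in SQL Server. I am trying to make a filtered index smaller by putting the following hint from BOL into practice: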

  

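"A column in the filtered index expression does not need to be a key or included column in the filtered index definition if the filtered index expression is equivalent to the query predicate and the query does not return the column in the filtered index expression with the query results."

I reproduced the problem in a small test script. My table looks like this: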

CREATE TABLE #test
(
    ID  BIGINT NOT NULL IDENTITY(1,1),
    ARCHIVEDATE DATETIME NULL,
    CLOSINGDATE DATETIME NULL,
    OBJECTTYPE INTEGER NOT NULL,
    ACTIVE BIT NOT NULL,
    FILLER1 CHAR(255) DEFAULT 'just a filler',
    FILLER2 CHAR(255) DEFAULT 'just a filler',
    FILLER3 CHAR(255) DEFAULT 'just a filler',
    FILLER4 CHAR(255) DEFAULT 'just a filler',
    FILLER5 CHAR(255) DEFAULT 'just a filler',
    CONSTRAINT test_pk PRIMARY KEY CLUSTERED (ID ASC)
);

I need to optimize the following query:

SELECT  
    COUNT(*) 
FROM    
    #test 
WHERE       
        ARCHIVEDATE IS NULL 
    AND CLOSINGDATE IS NOT NULL 
    AND ISNULL(ACTIVE,1) != 0

So I built the following filtered index:

CREATE NONCLUSTERED INDEX idx_filterTest ON #test (/*ARCHIVEDATE ASC,*/CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;

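ARCHIVEDATE is already in the filter and is not returned by the SELECT, so I did not put it in the index key or in the INCLUDE list.

However, when I run the query, I get the following plan (screenshots: plan for query; filters for operators).

The plan contains a key lookup into the clustered index to fetch ARCHIVEDATE. Why is that? I reproduced this behavior on SQL Server 2008 and on SQL Server 2016.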
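
One way to confirm the lookup without reading the graphical plan is to compare logical reads. This is only a sketch, and it assumes the #test table and idx_filterTest from the repro script further down already exist in the current session:

```sql
-- Assumption: #test and idx_filterTest were created by the repro script in this session.
-- STATISTICS IO prints the logical reads per table on the Messages tab.
SET STATISTICS IO ON;

SELECT
    COUNT(*)
FROM
    #test
WHERE
        ARCHIVEDATE IS NULL
    AND CLOSINGDATE IS NOT NULL
    AND ISNULL(ACTIVE,1) != 0;

SET STATISTICS IO OFF;
```

If the key lookup is present, the logical reads reported for #test should be noticeably higher than with an index that also contains ARCHIVEDATE, because every qualifying row triggers an extra seek into the clustered index.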

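If I create the index with ARCHIVEDATE in the key, I get a plain index seek with no lookup. So it seems to me that this paragraph in BOL does not always apply.

Here is my complete repro script: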

--DROP TABLE #test;
CREATE TABLE #test
(
    ID  BIGINT NOT NULL IDENTITY(1,1),
    ARCHIVEDATE DATETIME NULL,
    CLOSINGDATE DATETIME NULL,
    OBJECTTYPE INTEGER NOT NULL,
    ACTIVE BIT NOT NULL,
    FILLER1 CHAR(255) DEFAULT 'just a filler',
    FILLER2 CHAR(255) DEFAULT 'just a filler',
    FILLER3 CHAR(255) DEFAULT 'just a filler',
    FILLER4 CHAR(255) DEFAULT 'just a filler',
    FILLER5 CHAR(255) DEFAULT 'just a filler',
    CONSTRAINT test_pk PRIMARY KEY CLUSTERED (ID ASC)
);



INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 200
    NULL,
    dates.calcDate,
    4711,
    dates.number%2
FROM
    (
        SELECT
            /* Build the date by adding the sequence number to the start date */
            DATEADD(DAY, seq.number, '20120101') AS calcDate, number
        FROM
        (
            /* Generates a number sequence from 0 to 9999. It is the basis for producing every date in the period. The sequence covers a period of roughly 30 years. */
            SELECT
                a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
            FROM
                        ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
        ) seq 
        WHERE
            /* Restrict the number sequence to the number of days in the desired period */
            seq.number <= 5000
    ) dates
ORDER BY
    dates.number
;



INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 1000
    dates.calcDate + 3,
    dates.calcDate,
    4711,
    dates.number%2
FROM
    (
        SELECT
            /* Build the date by adding the sequence number to the start date */
            DATEADD(DAY, seq.number, '20120101') AS calcDate, number
        FROM
        (
            /* Generates a number sequence from 0 to 9999. It is the basis for producing every date in the period. The sequence covers a period of roughly 30 years. */
            SELECT
                a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
            FROM
                        ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
        ) seq 
        WHERE
            /* Restrict the number sequence to the number of days in the desired period */
            seq.number <= 5000
    ) dates
ORDER BY
    dates.number
;


INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 100000
    dates.calcDate,
    NULL,
    4711,
    dates.number%2
FROM
    (
        SELECT
            /* Erzeugen des Datums durch Addieren der jeweiligen Sequenznummer zum StartDate */
            DATEADD(DAY, seq.number, '20120101') AS calcDate, number
        FROM
        (
            /* Generates a number sequence from 0 to 9999. It is the basis for producing every date in the period. The sequence covers a period of roughly 30 years. */
            SELECT
                a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
            FROM
                        ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
            CROSS JOIN  ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
        ) seq 
        WHERE
            /* Restrict the number sequence to the number of days in the desired period */
            seq.number <= 5000
    ) dates
ORDER BY
    dates.number
;


--DROP INDEX idx_filterTest ON #test;
--CREATE NONCLUSTERED INDEX idx_filterTest ON #test (ARCHIVEDATE ASC,CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;
CREATE NONCLUSTERED INDEX idx_filterTest ON #test (/*ARCHIVEDATE ASC,*/CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;



SELECT  
    COUNT(*) 
FROM    
    #test 
WHERE       
        ARCHIVEDATE IS NULL 
    AND CLOSINGDATE IS NOT NULL 
    AND ISNULL(ACTIVE,1) != 0;

1 answer:

Answer 0 (score: 2)

This is a bug in the optimizer, specifically in the way it handles IS NULL filters. Here is a simpler repro:

CREATE TABLE #T(ID INT IDENTITY PRIMARY KEY, X INT);
INSERT #T(X) SELECT TOP(10000) message_id FROM sys.messages WHERE message_id <> 1;
INSERT #T(X) VALUES (1);
INSERT #T(X) VALUES (NULL);
CREATE INDEX IX_#T_X_null ON #T(ID) WHERE X IS NULL;
CREATE INDEX IX_#T_X_1 ON #T(ID) WHERE X = 1;

Clearly, IX_#T_X_null covers the following query:

SELECT MIN(ID) FROM #T WHERE X IS NULL;

The optimizer does pick it, but the execution plan contains a superfluous clustered index lookup. However:

SELECT MIN(ID) FROM #T WHERE X = 1;

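Now we get a plan without the clustered index lookup. When IS NULL is involved, the optimizer appears to recognize that the filtered index applies, but it fails to propagate the condition to later steps. We can see this clearly if we include the column in the index:

-- IX_#T_X_null already exists at this point, so rebuild it in place with the new key.
CREATE INDEX IX_#T_X_null ON #T(ID, X) WHERE X IS NULL WITH (DROP_EXISTING = ON);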

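If you now compare the execution plans of the WHERE X = 1 and WHERE X IS NULL queries, you will see that in the X IS NULL case the optimizer adds a predicate to the index scan, which it does not bother to do for X = 1.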
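
To put the two plans side by side, something like the following could be used. It is a sketch that assumes the #T table and both filtered indexes from the repro above exist in the current session:

```sql
-- Assumption: #T, IX_#T_X_null and IX_#T_X_1 were created by the repro above.
-- STATISTICS XML returns the actual execution plan as an extra result set
-- after each statement.
SET STATISTICS XML ON;

SELECT MIN(ID) FROM #T WHERE X IS NULL;  -- look for the residual predicate or lookup
SELECT MIN(ID) FROM #T WHERE X = 1;      -- plain scan of the filtered index expected

SET STATISTICS XML OFF;
```

Opening each returned plan and inspecting the properties of the index scan operator should show whether a predicate on X was added.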

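Digging further with this particular setup, you can find that this is a known issue, already reported on Connect. According to Microsoft, however, it is not actually a bug but a known "gap in functionality" (which I suppose is technically correct, since the results are not wrong; the query just does not perform as well as it could). Furthermore, "this is now an active DCR for a future release of SQL Server", but that was 6 years ago, and the ticket has since been closed as "won't fix", so don't hold your breath.

Unfortunately, the workaround is indeed to include the column in the index. I would make it an included column rather than a key column, because a key column adds overhead at the non-leaf levels: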

CREATE NONCLUSTERED INDEX idx_filterTest ON #test (CLOSINGDATE ASC)
INCLUDE (ACTIVE, ARCHIVEDATE) 
WHERE ARCHIVEDATE IS NULL;

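I say "unfortunately" because the always-NULL column will still pointlessly take up space in every row (DATETIME is a fixed-size data type). Even so, that is probably miles better than the extra I/O caused by the clustered index lookups. In addition, compressing the index can reduce the overhead to almost nothing (even row compression will do).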
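
As a rough sketch of the compression idea, the workaround index could be rebuilt with row compression. This assumes idx_filterTest already exists on #test (hence DROP_EXISTING) and that the edition in use supports data compression, which on older versions was limited to certain editions:

```sql
-- Assumption: idx_filterTest already exists on #test; DROP_EXISTING replaces it in place.
-- Assumption: the SQL Server edition in use supports DATA_COMPRESSION.
CREATE NONCLUSTERED INDEX idx_filterTest ON #test (CLOSINGDATE ASC)
INCLUDE (ACTIVE, ARCHIVEDATE)
WHERE ARCHIVEDATE IS NULL
WITH (DATA_COMPRESSION = ROW, DROP_EXISTING = ON);
```

Row compression stores fixed-length columns in a variable-length format, so the always-NULL ARCHIVEDATE should cost very little per row.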