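I'm currently experimenting with filtered indexes in SQL Server. I'm trying to slim down a filtered index by putting the following hint from BOL into practice:
"A column in the filtered index expression does not need to be a key or included column in the filtered index definition if the filtered index expression is equivalent to the query predicate and the query does not return the column in the filtered index expression with the query results."
I reproduced the problem in a small test script. My table looks like this: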
CREATE TABLE #test
(
ID BIGINT NOT NULL IDENTITY(1,1),
ARCHIVEDATE DATETIME NULL,
CLOSINGDATE DATETIME NULL,
OBJECTTYPE INTEGER NOT NULL,
ACTIVE BIT NOT NULL,
FILLER1 CHAR(255) DEFAULT 'just a filler',
FILLER2 CHAR(255) DEFAULT 'just a filler',
FILLER3 CHAR(255) DEFAULT 'just a filler',
FILLER4 CHAR(255) DEFAULT 'just a filler',
FILLER5 CHAR(255) DEFAULT 'just a filler',
CONSTRAINT test_pk PRIMARY KEY CLUSTERED (ID ASC)
);
I need to optimize the following query:
SELECT
COUNT(*)
FROM
#test
WHERE
ARCHIVEDATE IS NULL
AND CLOSINGDATE IS NOT NULL
AND ISNULL(ACTIVE,1) != 0
So I built the following filtered index:
CREATE NONCLUSTERED INDEX idx_filterTest ON #test (/*ARCHIVEDATE ASC,*/CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;
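ARCHIVEDATE is already in the filter and is not returned by the SELECT, so it is in neither the index key nor the included columns.
Yet the execution plan contains a key lookup into the clustered index for ARCHIVEDATE. Why is that? I reproduced this behavior on SQL Server 2008 and SQL Server 2016.
If I create the index with ARCHIVEDATE in the key, I get just an index seek, as expected. So it seems to me that this paragraph in BOL does not always apply.
Here is my complete repro script: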
--DROP TABLE #test;
CREATE TABLE #test
(
ID BIGINT NOT NULL IDENTITY(1,1),
ARCHIVEDATE DATETIME NULL,
CLOSINGDATE DATETIME NULL,
OBJECTTYPE INTEGER NOT NULL,
ACTIVE BIT NOT NULL,
FILLER1 CHAR(255) DEFAULT 'just a filler',
FILLER2 CHAR(255) DEFAULT 'just a filler',
FILLER3 CHAR(255) DEFAULT 'just a filler',
FILLER4 CHAR(255) DEFAULT 'just a filler',
FILLER5 CHAR(255) DEFAULT 'just a filler',
CONSTRAINT test_pk PRIMARY KEY CLUSTERED (ID ASC)
);
INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 200
NULL,
dates.calcDate,
4711,
dates.number%2
FROM
(
SELECT
/* Generate the date by adding the respective sequence number to the start date */
DATEADD(DAY, seq.number, '20120101') AS calcDate, number
FROM
(
/* Query that builds a number sequence from 0 to 9999. Serves as the basis for generating all date values in the period. The sequence is sufficient for a period of roughly 30 years. */
SELECT
a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
FROM
( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
) seq
WHERE
/* Restrict the number sequence to the number of days in the desired period */
seq.number <= 5000
) dates
ORDER BY
dates.number
;
INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 1000
dates.calcDate + 3,
dates.calcDate,
4711,
dates.number%2
FROM
(
SELECT
/* Generate the date by adding the respective sequence number to the start date */
DATEADD(DAY, seq.number, '20120101') AS calcDate, number
FROM
(
/* Query that builds a number sequence from 0 to 9999. Serves as the basis for generating all date values in the period. The sequence is sufficient for a period of roughly 30 years. */
SELECT
a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
FROM
( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
) seq
WHERE
/* Restrict the number sequence to the number of days in the desired period */
seq.number <= 5000
) dates
ORDER BY
dates.number
;
INSERT INTO #test
(ARCHIVEDATE, CLOSINGDATE, OBJECTTYPE, ACTIVE)
SELECT TOP 100000
dates.calcDate,
NULL,
4711,
dates.number%2
FROM
(
SELECT
/* Generate the date by adding the respective sequence number to the start date */
DATEADD(DAY, seq.number, '20120101') AS calcDate, number
FROM
(
/* Query that builds a number sequence from 0 to 9999. Serves as the basis for generating all date values in the period. The sequence is sufficient for a period of roughly 30 years. */
SELECT
a.num * 1000 + b.num * 100 + c.num * 10 + d.num AS number
FROM
( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
CROSS JOIN ( SELECT 0 AS num UNION ALL SELECT 1 AS num UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
) seq
WHERE
/* Restrict the number sequence to the number of days in the desired period */
seq.number <= 5000
) dates
ORDER BY
dates.number
;
--DROP INDEX idx_filterTest ON #test;
--CREATE NONCLUSTERED INDEX idx_filterTest ON #test (ARCHIVEDATE ASC,CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;
CREATE NONCLUSTERED INDEX idx_filterTest ON #test (/*ARCHIVEDATE ASC,*/CLOSINGDATE ASC) INCLUDE (ACTIVE) WHERE ARCHIVEDATE IS NULL;
SELECT
COUNT(*)
FROM
#test
WHERE
ARCHIVEDATE IS NULL
AND CLOSINGDATE IS NOT NULL
AND ISNULL(ACTIVE,1) != 0;
Answer 0 (score: 2)
This is a bug in the optimizer, specifically in how it handles IS NULL filters. Here's a simpler repro:
CREATE TABLE #T(ID INT IDENTITY PRIMARY KEY, X INT);
INSERT #T(X) SELECT TOP(10000) message_id FROM sys.messages WHERE message_id <> 1;
INSERT #T(X) VALUES (1);
INSERT #T(X) VALUES (NULL);
CREATE INDEX IX_#T_X_null ON #T(ID) WHERE X IS NULL;
CREATE INDEX IX_#T_X_1 ON #T(ID) WHERE X = 1;
Clearly, IX_#T_X_null covers the following query:
SELECT MIN(ID) FROM #T WHERE X IS NULL;
The optimizer does pick it, but we get an execution plan with a superfluous clustered index lookup inserted. Compare that with:
SELECT MIN(ID) FROM #T WHERE X = 1;
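Now we get a plan without the clustered index lookup. When IS NULL is involved, the optimizer seems to recognize that the filtered index applies, but fails to propagate the condition to later steps. Suppose we include the column in the index: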
-- DROP_EXISTING replaces the index of the same name created above.
CREATE INDEX IX_#T_X_null ON #T(ID, X) WHERE X IS NULL WITH (DROP_EXISTING = ON);
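If you now compare the execution plans of the WHERE X = 1 and WHERE X IS NULL queries, you'll see that in the X IS NULL case the optimizer adds a predicate to the index scan, something it doesn't bother with for X = 1.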
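One way to make that comparison without the graphical plan viewer is sketched below. It assumes the #T table and the filtered indexes from the repro above already exist in the current session; it only asks SQL Server to report the plan and I/O, and does not assert what the output will be on any particular version.

```sql
-- Report logical reads per table in the Messages output.
SET STATISTICS IO ON;
-- Return the actual execution plan as XML next to each result set.
SET STATISTICS XML ON;

-- Filter on IS NULL: inspect the plan for a Predicate on the index
-- operator and for any lookup into the clustered index.
SELECT MIN(ID) FROM #T WHERE X IS NULL;

-- Filter on an equality: same shape of query, for comparison.
SELECT MIN(ID) FROM #T WHERE X = 1;

SET STATISTICS XML OFF;
SET STATISTICS IO OFF;
```

Opening the two returned plan XML documents side by side shows whether the scan or seek on the filtered index carries an extra predicate, and the logical reads indicate whether the clustered index was touched.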
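Digging further, it turns out that with this particular setup, this is a known issue, already reported on Connect. According to Microsoft, however, this is not actually a bug but a "known functionality gap" (which I suppose is technically correct, since the results aren't wrong; the query just doesn't perform as well as it could). Furthermore, "this is now an active DCR for a future version of SQL Server", but that was 6 years ago and the ticket has since been closed as "won't fix", so don't hold your breath.
Unfortunately, the workaround is indeed to include the column in the index. I'd make it an included column rather than a key column, since a key column would add overhead at the non-leaf levels: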
CREATE NONCLUSTERED INDEX idx_filterTest ON #test (CLOSINGDATE ASC)
INCLUDE (ACTIVE, ARCHIVEDATE)
WHERE ARCHIVEDATE IS NULL;
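I say "unfortunately" because the always-NULL column will still pointlessly take up row space (since DATETIME is a fixed-size data type). Even so, it's probably miles better than the extra I/O you get from the clustered index lookups. In addition, compressing the index can reduce the overhead to almost zero (even row compression will do).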
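As a rough sketch of the compression suggestion, the workaround index could be created with row compression as shown below. The index name idx_filterTest_compressed is hypothetical, chosen only so that it does not collide with the existing idx_filterTest. This also assumes an edition that supports data compression (before SQL Server 2016 SP1 that means Enterprise edition).

```sql
-- Hypothetical index name; same definition as the workaround index,
-- with row compression so the always-NULL ARCHIVEDATE costs little space.
CREATE NONCLUSTERED INDEX idx_filterTest_compressed ON #test (CLOSINGDATE ASC)
INCLUDE (ACTIVE, ARCHIVEDATE)
WHERE ARCHIVEDATE IS NULL
WITH (DATA_COMPRESSION = ROW);
```

Whether the optimizer picks this index over an uncompressed twin depends on costing, so for a fair test it is probably simplest to drop the other filtered index first.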