Question

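I'm using Lucene to index the content of a CMS, so I extended the SQL Server database schema with an "IsIndexed" bit column that lets the Lucene indexer find pieces of content that have not yet been indexed.

I added an index to the Content table so that lookups on the IsIndexed column would be faster. This is what the database looks like: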

CREATE TABLE Content (
    DocumentId bigint,
    CategoryId bigint,
    Title nvarchar(255),
    AuthorUserId bigint,
    Body nvarchar(MAX),
    IsIndexed bit
)
CREATE TABLE Users (
    UserId bigint,
    UserName nvarchar(20)
)

The following indexes exist:

Content (
    PK_Content (Clustered) : DocumentId ASC
    IX_CategoryId (Non-Unique, Non-Clustered) : CategoryId ASC
    IX_AuthorUserId (Non-Unique, Non-Clustered) : AuthorUserId ASC
    IX_Indexed_ASC (Non-Unique, Non-Clustered) : IsIndexed ASC, DocumentId ASC
    IX_Indexed_DESC (Non-Unique, Non-Clustered) : IsIndexed DESC, DocumentId ASC
)

Users (
    PK_Users (Clustered) : UserId
)

This is the query used to find content that has not been indexed yet:

SELECT
    TOP 1
    Content.DocumentId,
    Content.CategoryId,
    Content.Title,
    Content.AuthorUserId,
    Content.Body,
    Users.UserName
FROM
    Content
    INNER JOIN Users ON Content.AuthorUserId = Users.UserId
WHERE
    IsIndexed = 0

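However, when I run it, the actual execution plan reports a Clustered Index Scan on PK_Content together with a Clustered Index Seek on PK_Users. The query takes about 300 ms to execute.

When I modify the query to remove the Users.UserName field and the inner join to Users, the query takes about 60 ms to run, and there is no Clustered Index Scan on PK_Content, only a Clustered Index Seek on PK_Content.

I tried this both before and after adding the descending index on the Content.IsIndexed column, and I also added Content.DocumentId to the IX_Indexed indexes, but it made no difference.

What am I doing wrong? I've created all the necessary indexes (and then some). The Content table has hundreds of thousands of rows, and so does the Users table, so I can't see why the optimizer would choose a scan.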

Answer 1

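Add an index to Content on the IsIndexed and AuthorUserId fields; the query should then do a seek. Depending on your version of SQL Server, you can also add an INCLUDE clause listing the fields used in your SELECT to make it faster still.

IX_Indexed_AuthorUserId (Non-Unique, Non-Clustered) : IsIndexed, AuthorUserId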
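As a rough sketch, the index described above could be created like this. The column list in the INCLUDE clause is an assumption based on the fields the question's SELECT reads; DocumentId is left out because it is the clustered key and is carried in every nonclustered index anyway. INCLUDE is available from SQL Server 2005 onward.

```sql
-- Sketch only: key columns follow the suggestion above;
-- the INCLUDE list is an assumption taken from the question's SELECT.
CREATE NONCLUSTERED INDEX IX_Indexed_AuthorUserId
    ON Content (IsIndexed, AuthorUserId)
    INCLUDE (CategoryId, Title, Body);
```

Including Body (nvarchar(MAX)) stores a second copy of it in the index, so this trades disk space and write cost for avoiding lookups into the clustered index.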

Answer 2

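An index on such a low-selectivity column (only two values, 0 and 1) is always going to be ignored; see "the tipping point". One option is to move the column in as the leftmost key of the clustered index and make the primary key constraint on DocumentId a nonclustered index: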

CREATE TABLE Content (
    DocumentId bigint,
    CategoryId bigint,
    Title nvarchar(255),
    AuthorUserId bigint,
    Body nvarchar(MAX),
    IsIndexed bit,
    constraint pk_DocumentId primary key nonclustered (DocumentId)
)

create unique clustered index cdxContent on Content (IsIndexed, DocumentId);
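Since the Content table already exists, the same layout would have to be reached by changing the existing keys rather than by recreating the table. A minimal sketch, assuming the current primary key constraint is named PK_Content (as in the question's index list) and that no foreign keys reference it:

```sql
-- Sketch only: assumes the constraint name PK_Content and no referencing foreign keys.
BEGIN TRANSACTION;

-- Drop the clustered primary key; the table temporarily becomes a heap.
ALTER TABLE Content DROP CONSTRAINT PK_Content;

-- Cluster the table on (IsIndexed, DocumentId).
CREATE UNIQUE CLUSTERED INDEX cdxContent ON Content (IsIndexed, DocumentId);

-- Re-add the primary key on DocumentId as a nonclustered index.
ALTER TABLE Content
    ADD CONSTRAINT pk_DocumentId PRIMARY KEY NONCLUSTERED (DocumentId);

COMMIT TRANSACTION;
```

Each step rebuilds the table or its nonclustered indexes, so on a table with hundreds of thousands of rows this is probably best run during a quiet period.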

Another option is to create a filtered covering index:

create unique index nonIndexedContent on Content (DocumentId)
  include (CategoryId, Title, AuthorUserId, Body)
  where IsIndexed = 0;
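For illustration, here is how the question's query and the indexer's follow-up update might look against that filtered index. Filtered indexes require SQL Server 2008 or later; the ORDER BY and the @DocumentId variable are assumptions added for the sketch, not part of the original query.

```sql
-- Sketch only: the predicate matches the index filter literally (IsIndexed = 0),
-- which lets the optimizer consider nonIndexedContent.
SELECT TOP 1
    c.DocumentId,
    c.CategoryId,
    c.Title,
    c.AuthorUserId,
    c.Body,
    u.UserName
FROM
    Content AS c
    INNER JOIN Users AS u ON c.AuthorUserId = u.UserId
WHERE
    c.IsIndexed = 0
ORDER BY
    c.DocumentId;  -- assumption: makes TOP 1 deterministic

-- After Lucene has indexed the document, flag it; the row then leaves the filtered index.
DECLARE @DocumentId bigint = 1;  -- hypothetical id returned by the query above
UPDATE Content SET IsIndexed = 1 WHERE DocumentId = @DocumentId;
```

Because only unindexed rows live in the index, it stays small as long as the indexer keeps up.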

This second option could duplicate a lot of content. Personally, I would go for the first option.
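To see the tipping point on your own data, one approach is to compare logical reads for the optimizer's chosen plan against a plan forced onto the existing nonclustered index. This is a diagnostic sketch using the index name from the question; the shortened column list is only to keep the example brief.

```sql
SET STATISTICS IO ON;

-- Plan chosen by the optimizer.
SELECT TOP 1 c.DocumentId, c.Title, u.UserName
FROM Content AS c
    INNER JOIN Users AS u ON c.AuthorUserId = u.UserId
WHERE c.IsIndexed = 0;

-- Same query, forced onto the nonclustered index from the question.
SELECT TOP 1 c.DocumentId, c.Title, u.UserName
FROM Content AS c WITH (INDEX(IX_Indexed_ASC))
    INNER JOIN Users AS u ON c.AuthorUserId = u.UserId
WHERE c.IsIndexed = 0;

SET STATISTICS IO OFF;
```

The Messages tab then shows the logical reads per table for each statement, which makes it easier to judge whether the optimizer's estimate was reasonable.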

SQL Server - Query performs an index scan instead of a seek