Question

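I have read a lot about indexes and the differences between them. Now I am working on query optimization in my project. I created a nonclustered index that should be used when the query executes, but it is not. Details follow: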

Table:

[image not available]

Index:

CREATE NONCLUSTERED INDEX [_IXProcedure_Deleted_Date] ON [por].[DailyAsset]
(
    [Deleted] ASC,
    [Date] DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Query generated by Entity Framework:

exec sp_executesql N'SELECT 
[Project1].[C1] AS [C1], 
[Project1].[AssetId] AS [AssetId], 
[Project1].[Active] AS [Active], 
[Project1].[Date] AS [Date]
FROM ( SELECT 
    [Extent1].[AssetId] AS [AssetId], 
    [Extent1].[Active] AS [Active], 
    [Extent1].[Date] AS [Date], 
    1 AS [C1]
    FROM [por].[DailyAsset] AS [Extent1]
    WHERE (0 = [Extent1].[Deleted]) AND ([Extent1].[Date] < @p__linq__0)
)  AS [Project1]
ORDER BY [Project1].[Date] DESC',N'@p__linq__0 datetime2(7)',@p__linq__0='2014-05-01 00:00:00'

Execution plan:

[image not available]

Missing index details:

The Query Processor estimates that implementing the following index could improve the query cost by 23.8027%.


CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [por].[DailyAsset] ([Deleted],[Date])
INCLUDE ([AssetId],[Active])

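I know that if the AssetId and Active columns are included in the index, the index will be used.

Now, why doesn't it work without the included columns?

This is a simplified example of other queries in which all columns are fetched in the result. The only way to get a (forced) index seek is to include all columns in the index, which has the same estimated subtree cost (obviously).

Another annoying issue is that the sort order is ignored. The Date column is in the index and set to DESCENDING. It is completely ignored, and of course the Sort operation takes up an expensive place in the execution plan.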

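UPDATE 1:

As @Jayachandran pointed out, IndexSeek + KeyLookUp should be used for the query above, but covering indexes are well documented, and a covering index presumes that the AssetId and Active columns are included. I agree with that.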

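I am adding UPDATE 1 to show how the covering index approach fares with the query below. Same table, bigger result set. As far as I can tell, no individual column should need to be included in the index, and the index on the Date and Deleted columns is already created.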

exec sp_executesql N'SELECT 
[Project1].[DailyAssetId] AS [DailyAssetId], 
[Project1].[AssetId] AS [AssetId], 
[Project1].[CreatedByUserId] AS [CreatedByUserId], 
[Project1].[UpdatedByUserId] AS [UpdatedByUserId], 
[Project1].[TimeCreated] AS [TimeCreated], 
[Project1].[TimeUpdated] AS [TimeUpdated], 
[Project1].[Deleted] AS [Deleted], 
[Project1].[TimeDeleted] AS [TimeDeleted], 
[Project1].[DeletedByUserId] AS [DeletedByUserId], 
[Project1].[Active] AS [Active], 
[Project1].[Date] AS [Date], 
[Project1].[Quantity] AS [Quantity], 
[Project1].[TotalBookValue] AS [TotalBookValue], 
[Project1].[CostPrice] AS [CostPrice], 
[Project1].[CostValue] AS [CostValue], 
[Project1].[FairPrice] AS [FairPrice], 
[Project1].[FairValue] AS [FairValue], 
[Project1].[UnsettledQuantity] AS [UnsettledQuantity], 
[Project1].[UnsettledValue] AS [UnsettledValue], 
[Project1].[SettlementDate] AS [SettlementDate], 
[Project1].[EffectiveDate] AS [EffectiveDate], 
[Project1].[PortfolioId] AS [PortfolioId]
FROM ( SELECT 
    [Extent1].[DailyAssetId] AS [DailyAssetId], 
    [Extent1].[AssetId] AS [AssetId], 
    [Extent1].[CreatedByUserId] AS [CreatedByUserId], 
    [Extent1].[UpdatedByUserId] AS [UpdatedByUserId], 
    [Extent1].[TimeCreated] AS [TimeCreated], 
    [Extent1].[TimeUpdated] AS [TimeUpdated], 
    [Extent1].[Deleted] AS [Deleted], 
    [Extent1].[TimeDeleted] AS [TimeDeleted], 
    [Extent1].[DeletedByUserId] AS [DeletedByUserId], 
    [Extent1].[Active] AS [Active], 
    [Extent1].[Date] AS [Date], 
    [Extent1].[Quantity] AS [Quantity], 
    [Extent1].[TotalBookValue] AS [TotalBookValue], 
    [Extent1].[CostPrice] AS [CostPrice], 
    [Extent1].[CostValue] AS [CostValue], 
    [Extent1].[FairPrice] AS [FairPrice], 
    [Extent1].[FairValue] AS [FairValue], 
    [Extent1].[UnsettledQuantity] AS [UnsettledQuantity], 
    [Extent1].[UnsettledValue] AS [UnsettledValue], 
    [Extent1].[SettlementDate] AS [SettlementDate], 
    [Extent1].[EffectiveDate] AS [EffectiveDate], 
    [Extent1].[PortfolioId] AS [PortfolioId]
    FROM [por].[DailyAsset] AS [Extent1]
    WHERE (0 = [Extent1].[Deleted]) AND ([Extent1].[Date] < @p__linq__0)
)  AS [Project1]
ORDER BY [Project1].[Date] DESC',N'@p__linq__0 datetime2(7)',@p__linq__0='2014-05-01 00:00:00'

Answer 1

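In this case, the difference between a scan and a seek (with key lookups) comes down to the number of rows returned. The volume is too large, so the optimizer chose the cheaper plan: just scan the whole table. That is faster than using the NC index.

Imagine you forced it to use the NC index: it would have to do a key lookup for 40% of the rows in the table. That is like a foreach loop that executes that many times. So SQL Server chooses to just scan the table, because that is faster than the loop.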
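If you want to see this trade-off for yourself, one way is to run the query both ways and compare the logical reads. This is only a sketch: the column list is taken from the first query in the question, the `@cutoff` variable stands in for the EF parameter, and `OPTION (RECOMPILE)` is my addition so that the optimizer sees the actual value of the variable.

```sql
SET STATISTICS IO ON;

DECLARE @cutoff datetime2(7) = '2014-05-01 00:00:00';

-- 1) Let the optimizer choose (a clustered index scan in the question).
SELECT AssetId, Active, [Date]
FROM por.DailyAsset
WHERE Deleted = 0 AND [Date] < @cutoff
ORDER BY [Date] DESC
OPTION (RECOMPILE);

-- 2) Force the nonclustered index: an index seek plus one key lookup
--    per qualifying row, because AssetId and Active are not in the index.
SELECT AssetId, Active, [Date]
FROM por.DailyAsset WITH (INDEX([_IXProcedure_Deleted_Date]))
WHERE Deleted = 0 AND [Date] < @cutoff
ORDER BY [Date] DESC
OPTION (RECOMPILE);

SET STATISTICS IO OFF;
```

Compare the "logical reads" reported on the Messages tab for the two statements. If a large share of the table qualifies, the forced plan will typically report far more reads than the scan.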

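As for how to account for additional columns that other queries may need, there are really a couple of options. You can create a covering index that includes the most commonly used columns, or you can change the primary key so that it follows the most common access path, i.e. by Date, Deleted, and an identity column for uniqueness.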
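A rough sketch of the second option (re-clustering the table along the access path) might look like the following. Several things here are assumptions on my part: the constraint name `PK_DailyAsset` and the index name `CIX_DailyAsset_Date_Deleted` are hypothetical, the current primary key is assumed to be a clustered PK on `DailyAssetId`, and `DailyAssetId` is used as the uniqueness tiebreaker instead of a new identity column. Dropping the PK will fail if foreign keys reference it, and rebuilding the clustered index rewrites the whole table, so try it on a copy first.

```sql
-- Hypothetical constraint name; look up the real one in sys.key_constraints.
ALTER TABLE por.DailyAsset DROP CONSTRAINT PK_DailyAsset;

-- Cluster the table along the most common access path.
-- DailyAssetId is appended only to make the key unique.
CREATE UNIQUE CLUSTERED INDEX CIX_DailyAsset_Date_Deleted
ON por.DailyAsset ([Date] DESC, Deleted, DailyAssetId);

-- Keep DailyAssetId as the primary key, now enforced by a nonclustered index.
ALTER TABLE por.DailyAsset
ADD CONSTRAINT PK_DailyAsset PRIMARY KEY NONCLUSTERED (DailyAssetId);
```

With the table itself ordered by Date, a range predicate on Date can be answered from the clustered index without key lookups, whatever columns the query selects.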

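On another note, using a guid as the primary key causes all sorts of problems for the clustered index and for every other index (because the key of a clustered PK is included in all the other indexes). The random ordering of guids causes rows to be inserted into pages in random order. Since the index is ordered, pages have to be split constantly to make room for new rows. Creating a naturally increasing index would be much better, and it may also help with the problem above, depending on the kind of queries being written.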
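To check whether page splits from random inserts are actually hurting this table, you could look at index fragmentation. A minimal sketch, assuming you run it in the database that contains `por.DailyAsset`:

```sql
-- Fragmentation and size of every index on por.DailyAsset.
-- 'LIMITED' is the cheapest scan mode of this DMV.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'por.DailyAsset'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

A clustered index that shows high fragmentation again soon after a rebuild is a hint that the insert order does not match the key order.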

Answer 2

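The ideal index for a particular query is one in which (1) all fields in the WHERE clause are in the index, and (2) all fields in the SELECT clause are included in the index. If (1) is not met, SQL Server will weigh the cost of visiting multiple indexes and pick what it thinks is fastest; if (2) is not met, it means an expensive Key Lookup operation. If the index has very high selectivity (few repeated values), SQL Server may decide the lookups are worth it.

In your case, condition (2) is clearly not met. SQL Server considers the Key Lookup operation too expensive compared with a clustered index scan, so it chose the latter. You can force SQL Server to use a specific index, but I don't know how to do that with Entity Framework.

If this query has to be fast, create the index as SQL Server suggests.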
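Spelled out, the suggested index could look like this. The index name is hypothetical, and I have kept `[Date] DESC` from the index in the question, which the missing index suggestion itself does not specify:

```sql
-- Key columns serve the WHERE clause; INCLUDE columns cover the SELECT list.
CREATE NONCLUSTERED INDEX [IX_DailyAsset_Deleted_Date_Covering]  -- hypothetical name
ON [por].[DailyAsset] ([Deleted] ASC, [Date] DESC)
INCLUDE ([AssetId], [Active]);
```

Because Deleted is pinned by an equality predicate, the matching rows in this index are already ordered by Date, so the Sort operator should usually disappear from the plan as well. Check the actual execution plan to confirm. Note that this only covers the first query; the query in UPDATE 1 selects every column, so it would still need key lookups.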

SQL Server: NonClustered index is not used
