Question

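I have the following query. DataStaging is a large table with 10 million rows.

Product is a smaller table with 1,000 rows.

We need to find the product id using Ref. The Product table has two ref columns (Ref1, Ref2), so we have to join that table twice.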

    UPDATE  dbo.DataStaging
    SET     ProductId = COALESCE(Ref1_Pid, Ref2_Pid, 0) 
    FROM    dbo.DataStaging s
            LEFT JOIN ( SELECT  id [Ref1_Pid] ,
                                Ref1
                        FROM    dbo.Product
                        WHERE   isActive = 1
                      ) p1 ON s.[Ref] = p1.Ref1
            LEFT JOIN ( SELECT  id [Ref2_Pid] ,
                                Ref2
                        FROM    dbo.Product
                        WHERE   IsActive = 1
                      ) p2 ON s.[Ref] = p2.Ref2

    WHERE   s.TypeId = 1
            AND s.StatusId = 2

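The Product table has a primary key, PK_Product, and I am free to add nonclustered (NC) indexes.

(1) Three single-column indexes: NC_Index on (IsActive), NC_Index on (Ref1), NC_Index on (Ref2)

(2) Two composite indexes: NC_Index on (IsActive, Ref1), NC_Index on (IsActive, Ref2)

(3) One composite index: NC_Index on (IsActive, Ref1, Ref2)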
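As a sketch, the three options could be created roughly as follows. The index names are made up for illustration (the post only calls them NC_Index), and you would create only one of the three sets at a time when comparing plans.

```sql
-- Option (1): three single-column indexes (hypothetical names)
CREATE NONCLUSTERED INDEX NC_Product_IsActive ON dbo.Product (IsActive);
CREATE NONCLUSTERED INDEX NC_Product_Ref1     ON dbo.Product (Ref1);
CREATE NONCLUSTERED INDEX NC_Product_Ref2     ON dbo.Product (Ref2);

-- Option (2): two composite indexes (hypothetical names)
CREATE NONCLUSTERED INDEX NC_Product_IsActive_Ref1 ON dbo.Product (IsActive, Ref1);
CREATE NONCLUSTERED INDEX NC_Product_IsActive_Ref2 ON dbo.Product (IsActive, Ref2);

-- Option (3): one composite index over all three columns (hypothetical name)
CREATE NONCLUSTERED INDEX NC_Product_IsActive_Ref1_Ref2
    ON dbo.Product (IsActive, Ref1, Ref2);
```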

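For (1), it scans the table using the primary key PK_Product and does not use the NC indexes.

For (2), it uses an NC index scan on each index.

For (3), it uses an NC index scan on the same index for both joins, but the row size is twice that of (2).

As a result, performance is (2) > (3) > (1).

My questions are:

Why doesn't (1) use the NC indexes?

What are the drawbacks of creating indexes like (2) or (3)?

Assume the above query is the most important process on Product, but a number of stored procs also select from the Product table with different WHERE conditions. Is (2) still a better approach than (3), even though the performance of the above query is (2) > (3)?

(Ignore the indexes on DataStaging for now.)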

Answer 1

(1) would require an index join between the index on IsActive and the index on Ref1 / Ref2, which the optimizer deems less optimal.
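To see roughly what such an index join looks like, you could force it with a multi-index table hint and compare the plan against the unhinted one. This is only a sketch: the index names are hypothetical stand-ins for the option (1) indexes, and it assumes id is available from the nonclustered indexes (for example, because PK_Product is clustered on id); otherwise an extra lookup would appear in the plan.

```sql
-- Hypothetical index names for option (1); not from the original post.
-- Listing two indexes in one hint asks SQL Server to combine them
-- (index ANDing, i.e. an index join) instead of scanning PK_Product.
SELECT  p.id ,
        p.Ref1
FROM    dbo.Product AS p WITH ( INDEX ( NC_Product_IsActive, NC_Product_Ref1 ) )
WHERE   p.IsActive = 1;
```

On a 1,000-row table, the hinted plan will most likely show a higher estimated cost than a single scan, which would be consistent with the optimizer ignoring the single-column indexes.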

I would go for a variation on (2): two filtered indexes with an included column:

    create index IX_Product_Ref1 on Product (Ref1) include(id) where (IsActive = 1)
    create index IX_Product_Ref2 on Product (Ref2) include(id) where (IsActive = 1)

(3) is only a good idea if you query on IsActive, Ref1 and Ref2 together.

Also, couldn't you write your query like this?

    UPDATE  dbo.DataStaging
    SET     ProductId = isnull(p.id, p2.id)
    FROM    dbo.DataStaging s
            LEFT JOIN dbo.Product p ON s.[Ref] = p.Ref1 and p.IsActive = 1
            LEFT JOIN dbo.Product p2 ON s.[Ref] = p2.Ref2 and p2.IsActive = 1
    WHERE   s.TypeId = 1
            AND s.StatusId = 2
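Note that isnull(p.id, p2.id) leaves ProductId as NULL when neither join matches, whereas the query in the question falls back to 0. If that fallback matters, a variant along these lines should preserve it. This is a sketch, not a tested replacement; it also updates through the alias s, which makes it explicit which instance of the table is the update target.

```sql
-- Same joins as above, but keeping the question's fallback value of 0
-- when neither Ref1 nor Ref2 matches an active product.
UPDATE  s
SET     ProductId = COALESCE(p1.id, p2.id, 0)
FROM    dbo.DataStaging AS s
        LEFT JOIN dbo.Product AS p1 ON s.[Ref] = p1.Ref1 AND p1.IsActive = 1
        LEFT JOIN dbo.Product AS p2 ON s.[Ref] = p2.Ref2 AND p2.IsActive = 1
WHERE   s.TypeId = 1
        AND s.StatusId = 2;
```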
