Question

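I am updating a table from the result of a self-join through a CTE. When I run it as a SELECT to preview the expected result of the update, it returns in 16 seconds.

However, when I use the same syntax to actually update the underlying table, it is extremely slow.

For the life of me, I can't figure out why there is such a dramatic slowdown. I update tables through CTEs all the time, and it is usually fast and reasonable compared with the equivalent SELECT.

I have tried with and without a primary key / clustered index on the underlying table, but it made no difference.

The join is on a computed column, so it cannot be indexed.

If the UPDATE took twice as long as the SELECT, I would not be concerned. The problem is the scale of the increase when going from the SELECT to the UPDATE. The SELECT returns 1323 rows in 16 seconds; updating 2 of those rows takes 59 seconds, updating 4 takes 1 minute 19 seconds, and updating 6 takes 1 minute 39 seconds (so each additional row seems to cost about 10 seconds).

Can someone shed some light on this and suggest a way to speed it up?

Here is the sample code: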

;WITH CTE AS (SELECT
                DENSE_RANK() OVER (Order by
                                col1,
                                col2,
                                col3) SetID,                
                COUNT(*) OVER (partition by
                                col1,
                                col2,
                                col3) DupsInSet,
                row_number() OVER (PARTITION BY
                                col1,
                                col2,
                                col3
                                ORDER BY                                    
                                col4 desc) RowInSet,
                COUNT(col4) OVER (partition by
                                col1,
                                col2,
                                col3) NonNull,
                *
            FROM mytable)

-- The following completes in 16 seconds and returns 1323 rows

    select b.col4,a.*

    from cte a
    join cte b on b.SetID=a.SetID

    where a.DupsInSet>1
    and a.NonNull>0
    and b.RowInSet=1
    and a.RowInSet>1
    and b.col4 is not null
    and a.col4 is null

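-- The update from this ran for so long that I never let it finish. As a test, I limited the update to the top 2 rows; it then took 59 seconds just to update 2 rows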

    UPDATE TOP(2) a

    SET a.col4=b.col4

    from cte a
    join cte b on b.SetID=a.SetID

    where a.DupsInSet>1
    and a.NonNull>0
    and b.RowInSet=1
    and a.RowInSet>1
    and b.col4 is not null
    and a.col4 is null

Actual execution plan for the SELECT: https://www.brentozar.com/pastetheplan/?id=ByItSnhuW

Actual execution plan for the UPDATE: https://www.brentozar.com/pastetheplan/?id=Ske_In3_Z

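Update

Run exactly as SqlZim suggested, the query went for more than 4 minutes without finishing, so I stopped it.

However, with a schema change (thanks SqlZim) from several VARCHAR(MAX) columns to VARCHAR(?), where ? is the max(len(column)), and some modifications to SqlZim's suggested query, the update was able to run in 23 seconds! ...about 3,300 times faster :)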
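The schema change described above can be sketched roughly as follows. This is only an illustration under assumptions: `col1` stands in for any one of the VARCHAR(MAX) columns, and the length 200 is a made-up value that would come from the first query. One plausible reason the change helps is that VARCHAR(MAX) columns cannot be index key columns and tend to make sorts and window aggregates more expensive, but the execution plans would need to confirm that for this table.

```sql
-- Hypothetical sketch: col1 stands in for any VARCHAR(MAX) column
-- used in the PARTITION BY / join predicates.

-- Step 1: measure the longest value actually stored.
-- LEN ignores trailing spaces; DATALENGTH counts bytes and includes them.
SELECT MAX(LEN(col1))        AS max_len_col1,
       MAX(DATALENGTH(col1)) AS max_bytes_col1
FROM dbo.mytable;

-- Step 2: narrow the column. 200 is an assumed value; use the result of
-- step 1 (plus any headroom you want). State NULL / NOT NULL explicitly
-- so the column keeps its current nullability.
ALTER TABLE dbo.mytable
    ALTER COLUMN col1 VARCHAR(200) NULL;
```

The ALTER fails with a truncation error if any existing value is longer than the new length, so it is worth running step 1 immediately before step 2.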

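Here is the final query (unless someone can make it work without having to list all of the relevant columns, e.g. by joining on SetID):

(Note: the CTE is still used to filter the original table down from 600k+ rows)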

;WITH CTE AS (SELECT                             
                COUNT(*) OVER (partition by
                                [col1]
                              ,[col2]
                              ,[col3]
                              ,[col4]
                              ,[col5]
                              ,[col6]
                              ,[col7]
                              ,[col8]
                              ,[col9]
                              ,[col10]
                              ,[col11]
                              ,[col12]
                              ,[col13]
                              ,[col14]
                              ,[col15]
                              ,[col16]
                              ,[col17]
                              ,[col18]
                              ,[col19]
                              ,[col20]
                              ,[col21]
                              ,[col22]
                              ,[col23]) DupsInSet,
                COUNT(col24) OVER (partition by
                                [col1]
                              ,[col2]
                              ,[col3]
                              ,[col4]
                              ,[col5]
                              ,[col6]
                              ,[col7]
                              ,[col8]
                              ,[col9]
                              ,[col10]
                              ,[col11]
                              ,[col12]
                              ,[col13]
                              ,[col14]
                              ,[col15]
                              ,[col16]
                              ,[col17]
                              ,[col18]
                              ,[col19]
                              ,[col20]
                              ,[col21]
                              ,[col22]
                              ,[col23]) NonNull,
                *
            FROM mytable
    )

update a
  set a.col24 = b.col24
from cte a
  cross apply (
    select top 1 i.col24
    from cte i
    where (i.col1=a.col1 OR (i.col1 is null AND a.col1 is null))
         and (i.col2=a.col2 OR (i.col2 is null AND a.col2 is null))
         and (i.col3=a.col3 OR (i.col3 is null AND a.col3 is null))
         and (i.col4=a.col4 OR (i.col4 is null AND a.col4 is null))
         and (i.col5=a.col5 OR (i.col5 is null AND a.col5 is null))
         and (i.col6=a.col6 OR (i.col6 is null AND a.col6 is null))
         and (i.col7=a.col7 OR (i.col7 is null AND a.col7 is null))
         and (i.col8=a.col8 OR (i.col8 is null AND a.col8 is null))
         and (i.col9=a.col9 OR (i.col9 is null AND a.col9 is null))
         and (i.col10=a.col10 OR (i.col10 is null AND a.col10 is null))
         and (i.col11=a.col11 OR (i.col11 is null AND a.col11 is null))
         and (i.col12=a.col12 OR (i.col12 is null AND a.col12 is null))
         and (i.col13=a.col13 OR (i.col13 is null AND a.col13 is null))
         and (i.col14=a.col14 OR (i.col14 is null AND a.col14 is null))
         and (i.col15=a.col15 OR (i.col15 is null AND a.col15 is null))
         and (i.col16=a.col16 OR (i.col16 is null AND a.col16 is null))
         and (i.col17=a.col17 OR (i.col17 is null AND a.col17 is null))
         and (i.col18=a.col18 OR (i.col18 is null AND a.col18 is null))
         and (i.col19=a.col19 OR (i.col19 is null AND a.col19 is null))
         and (i.col20=a.col20 OR (i.col20 is null AND a.col20 is null))
         and (i.col21=a.col21 OR (i.col21 is null AND a.col21 is null))
         and (i.col22=a.col22 OR (i.col22 is null AND a.col22 is null))
         and (i.col23=a.col23 OR (i.col23 is null AND a.col23 is null))
        and i.col24 is not null
    order by col24 desc
  ) b
where a.col24 is null
and a.DupsInSet>1
and a.NonNull>0

Answer 1

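From what I can tell, you want to update col4 where it is null, for rows that match on col1, col2, col3, and you want to use the first non-null value of col4 based on col4 desc.

You could do that like so: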

update a
  set a.col4 = b.col4
from mytable a
  cross apply (
    select top 1 i.col4
    from mytable i
    where i.col1 = a.col1
      and i.col2 = a.col2
      and i.col3 = a.col3
      and i.col4 is not null
    order by col4 desc
  ) b
where a.col4 is null

You could also support this with a supporting index, e.g.:

create nonclustered index ix_mytable_col1_col2_col3_inc_col4
  on dbo.mytable (col1,col2,col3) 
    include (col4);
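
To make the behavior of the CROSS APPLY update concrete, here is a small sketch on hypothetical sample data, with the temp table `#t` standing in for mytable. It also shows one limitation of the plain equality predicates: under the default ANSI_NULLS setting, rows whose grouping columns contain NULL never match each other, which is presumably why the question's final query adds explicit NULL checks.

```sql
-- Hypothetical sample data; #t stands in for mytable.
CREATE TABLE #t (col1 INT NULL, col2 INT NULL, col3 INT NULL, col4 VARCHAR(10) NULL);

INSERT INTO #t (col1, col2, col3, col4) VALUES
    (1, 1, 1,    'keep'),  -- donor row for group (1,1,1)
    (1, 1, 1,    NULL),    -- gets filled with 'keep'
    (2, 2, NULL, 'lost'),  -- group key contains a NULL
    (2, 2, NULL, NULL);    -- not filled: NULL = NULL does not evaluate to true

UPDATE a
   SET a.col4 = b.col4
FROM #t a
CROSS APPLY (
    SELECT TOP (1) i.col4
    FROM #t i
    WHERE i.col1 = a.col1
      AND i.col2 = a.col2
      AND i.col3 = a.col3
      AND i.col4 IS NOT NULL
    ORDER BY i.col4 DESC
) b
WHERE a.col4 IS NULL;

SELECT col1, col2, col3, col4 FROM #t ORDER BY col1, col4 DESC;
DROP TABLE #t;
```

Only the second row changes; the last row keeps its NULL because its group key includes a NULL in col3.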
