Using SQL Server

Time: 2016-01-18 00:02:21

Tags: sql-server

Overview

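I'm converting the UnitsOrdered column from int to float. My strategy is to create a new column with the new data type and populate it with the data from the old column. This will run against tables with hundreds of millions of records. To keep the transactions small, I want to chunk the process, looping over a fixed number of rows at a time.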

Problem

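The problem I'm seeing is that the updates get progressively slower. I'm hoping there is a simple explanation (and therefore a fix). My hunch was that updating rows 1-5 would take the same time as rows 6-10, and the same as rows 1000001-1000005, but that doesn't seem to be the case.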

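Things I've tried:

  • Increasing the log size (e.g., to 25 GB)
  • Changing option(recompile) and with(tablock)
  • Rebuilding statistics at the start and during the run (e.g., every 10 million records)
  • Using a cte, not using a cte, inserting between ranges, etc.
  • Changing the number of records updated at a time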

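Notes

  • There is no index on UnitsOrdered
  • There is a clustered primary key on OrderItemID
  • Happy to provide more information. The table's schema is super vanilla (by design)... roughly: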

    Create table dbo.OrderItems
    (
         OrderItemID int primary key identity, 
         UnitsOrdered int NOT NULL, 
         UnitsOrdered1 real NOT NULL
    )
    

SQL:

    print 'Populating column data...';
    GO
    set nocount on;
    declare 
        @totalRows int = 0,
        @affectedRows int = 0,
        @rowsFetched int = 1000000,
        @rowsProcessed int = 0,
        @statusMessage nvarchar(100),
        @time datetime,
        @refreshEvery int = 10000000

    select @totalRows = Count(*)
    from dbo.OrderItems

    while(1 = 1) begin

        set @time = getUtcDate();

        ;with cte as(
            select *
            from dbo.OrderItems with(tablock)
            order by 
                OrderItemID

            offset @rowsProcessed ROWS
            fetch next @rowsFetched ROWS ONLY
        )

        update cte
        set
            UnitsOrdered1 = UnitsOrdered
            option(RECOMPILE) 

        set @affectedRows = @@ROWCOUNT;

        if(@affectedRows = 0) begin
            break;
        end

        --increment processed rows
        set @rowsProcessed = @rowsProcessed + @affectedRows;

        --set status message (%% is escaped)
        set @statusMessage = Concat(    '->',
                                        @affectedRows,
                                        ' rows updated in ',
                                        datediff(s,@time,getUtcDate()),
                                        ' seconds. ',
                                        (Cast(@rowsProcessed as float) / Cast(NULLIF(@totalRows,0) as float))*100,
                                        '%% complete...');

        --we use raiserror so we can output the message immediately instead of buffering it.
        raiserror(@statusMessage,0,1) with nowait

    end
    GO

Progress:

    Populating column data...
    ->1000000 rows updated in 1 seconds. 2.61037% complete...
    ->1000000 rows updated in 3 seconds. 5.22074% complete...
    ->1000000 rows updated in 2 seconds. 7.83111% complete...
    ->1000000 rows updated in 3 seconds. 10.4415% complete...
    ->1000000 rows updated in 3 seconds. 13.0519% complete...
    ->1000000 rows updated in 4 seconds. 15.6622% complete...
    ->1000000 rows updated in 4 seconds. 18.2726% complete...
    ->1000000 rows updated in 4 seconds. 20.883% complete...
    ->1000000 rows updated in 4 seconds. 23.4933% complete...
    ->1000000 rows updated in 5 seconds. 26.1037% complete...
    ->1000000 rows updated in 9 seconds. 28.7141% complete...
    ->1000000 rows updated in 5 seconds. 31.3245% complete...
    ->1000000 rows updated in 6 seconds. 33.9348% complete...
    ->1000000 rows updated in 6 seconds. 36.5452% complete...
    ->1000000 rows updated in 6 seconds. 39.1556% complete...
    ->1000000 rows updated in 7 seconds. 41.7659% complete...
(etc. for several million records)

    ->1000000 rows updated in 71 seconds. 84.3763% complete...
    ->1000000 rows updated in 74 seconds. 86.9867% complete...
    ->1000000 rows updated in 87 seconds. 89.597% complete...
    ->1000000 rows updated in 92 seconds. 92.2074% complete...

Thanks very much!

1 Answer:

Answer 0 (score: 2)

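The progressive slowdown is expected. It is an inherent problem with any pagination technique based on row counts: to honor OFFSET, the server has to read and discard every skipped row on each iteration, so each batch costs more than the one before it. You will likely get better and more consistent performance by batching the updates over ranges of OrderItemID values, so that the index on that column can be used to efficiently locate and touch only the rows that need updating.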
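As a rough illustration of the key-range approach, the loop below walks OrderItemID in fixed-width ranges instead of using OFFSET/FETCH. It is a minimal sketch, not a tested migration script: the variable names (@batchSize, @fromId, @toId, @maxId) and the batch width are illustrative, and it assumes OrderItemID holds positive identity values so the int arithmetic cannot overflow.

    -- Sketch only: batch by key range so each update can seek on the
    -- clustered primary key instead of skipping rows with OFFSET.
    -- All variable names and the batch width are illustrative assumptions.
    set nocount on;

    declare
        @batchSize int = 1000000,   -- width of each OrderItemID range
        @fromId int,
        @toId int,
        @maxId int;

    select @fromId = min(OrderItemID), @maxId = max(OrderItemID)
    from dbo.OrderItems;

    -- On an empty table both values are NULL and the loop body never runs.
    while (@fromId <= @maxId) begin

        -- Upper bound of this range, clamped to @maxId.
        set @toId = case when @maxId - @fromId < @batchSize
                         then @maxId
                         else @fromId + @batchSize - 1
                    end;

        update dbo.OrderItems
        set UnitsOrdered1 = UnitsOrdered
        where OrderItemID between @fromId and @toId;

        if (@toId = @maxId) break;

        set @fromId = @toId + 1;
    end

Because the ranges are defined by key value rather than row count, a batch may touch fewer than @batchSize rows wherever the identity sequence has gaps. Rows inserted after @maxId is captured are not covered by this loop and would need a follow-up pass.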