Arithmetic overflow on a large table

Asked: 2016-04-07 10:23:14

Tags: sql sql-server database cursor sql-server-2014

I have a table with 500 million rows in SQL Server 2014 (Developer Edition, x64, Windows 10 Pro x64):

CREATE TABLE TestTable
(
  ID BIGINT IDENTITY(1,1),
  PARENT_ID BIGINT NOT NULL,
  CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED (ID)
);

CREATE NONCLUSTERED INDEX IX_TestTable_ParentId
ON TestTable (PARENT_ID);

I am trying to apply the following patch:

-- Create non-nullable column with default (should be online operation in Enterprise/Developer edition)
ALTER TABLE TestTable
ADD ORDINAL TINYINT NOT NULL CONSTRAINT DF_TestTable_Ordinal DEFAULT 0;
GO

-- Populate column value for existing data
BEGIN

  SET NOCOUNT ON;

  DECLARE @BATCH_SIZE BIGINT = 1000000;
  DECLARE @COUNTER BIGINT = 0;

  DECLARE @ROW_ID BIGINT;
  DECLARE @ORDINAL BIGINT;

  DECLARE ROWS_C CURSOR
    LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY
  FOR 
    SELECT
      ID AS ID,
      ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
    FROM
      TestTable;

  OPEN ROWS_C;

  FETCH NEXT FROM ROWS_C
  INTO @ROW_ID, @ORDINAL;

  BEGIN TRANSACTION;

  WHILE @@FETCH_STATUS = 0
  BEGIN

    UPDATE TestTable
    SET
      ORDINAL = CAST(@ORDINAL AS TINYINT)
    WHERE
      ID = @ROW_ID;

    FETCH NEXT FROM ROWS_C
    INTO @ROW_ID, @ORDINAL;

    SET @COUNTER = @COUNTER + 1;

    IF @COUNTER = @BATCH_SIZE
    BEGIN
      COMMIT TRANSACTION;
      SET @COUNTER = 0;
      BEGIN TRANSACTION;
    END;

  END;

  COMMIT TRANSACTION;

  CLOSE ROWS_C;
  DEALLOCATE ROWS_C;

  SET NOCOUNT OFF;

END;
GO

-- Drop default constraint from the column
ALTER TABLE TestTable
DROP CONSTRAINT DF_TestTable_Ordinal;
GO

-- Drop IX_TestTable_ParentId index
DROP INDEX IX_TestTable_ParentId
ON TestTable;
GO

-- Create IX_TestTable_ParentId_Ordinal index
CREATE UNIQUE INDEX IX_TestTable_ParentId_Ordinal
ON TestTable (PARENT_ID, ORDINAL);
GO

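The purpose of the patch is to add a column named ORDINAL that holds the ordinal number of each record among the records sharing the same parent (as defined by PARENT_ID). The patch is run with SQLCMD.

The patch is written this way for the following reasons:

  • The table is too large to run a single UPDATE statement over it (that takes too much time and too much space in the transaction log/tempdb).
  • Batching with a single UPDATE statement and TOP n rows is not easy to get right: if we update the table in, say, 1M-row batches, the 1,000,001st row may belong to the same PARENT_ID as the 1,000,000th, which would assign the wrong ordinal to the 1,000,001st record. In other words, the SELECT statement that feeds the cursor should either run once (without paging) or use more complex logic (joins/conditions).
  • Adding the column as NULL and altering it to NOT NULL later is not a good solution, because I use SNAPSHOT isolation (altering the column to NOT NULL would perform a full-table update).

The patch works fine on a small database with a few million rows, but when applied to the one with billions of rows, I get: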

  

Msg 3606, Level 16, State 2, Server XXX, Line 22
  Arithmetic overflow occurred.

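My first guess was that the ORDINAL values were too large to fit in the TINYINT column, but that is not the case. I created a test database with a similar structure and populated it with data (more than 255 rows per parent). The error I got there was still an arithmetic exception, but with a different message code and different wording (it explicitly said that the data could not fit into TINYINT).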
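The TINYINT hypothesis can also be checked directly against the data. The following is a minimal sketch, assuming the TestTable schema shown above; the alias RowsPerParent is introduced here only for illustration. The first query finds the largest sibling group, and the second statement reproduces the explicit conversion failure in isolation so its message can be compared with Msg 3606.

```sql
-- Largest sibling group: if RowsPerParent exceeds 255,
-- the ROW_NUMBER() values for that parent will not fit in TINYINT.
SELECT TOP (1)
    PARENT_ID,
    COUNT_BIG(*) AS RowsPerParent   -- illustrative alias, not from the original post
FROM TestTable
GROUP BY PARENT_ID
ORDER BY RowsPerParent DESC;

-- Isolated reproduction of the conversion failure: TINYINT holds 0..255,
-- so casting 256 raises an arithmetic overflow error that names the tinyint type.
DECLARE @ORDINAL BIGINT = 256;
SELECT CAST(@ORDINAL AS TINYINT);
```

On the 5-hundred-million-row-plus table the GROUP BY scans the whole PARENT_ID index, so it is worth running it off-hours or against a restored copy.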

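At the moment I have a couple of suspicions, but I cannot find anything that would confirm them:

  • A CURSOR cannot handle more than MAX(INT32) rows.
  • SQLCMD imposes a limit.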
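Whether the first suspicion is even plausible depends on the row count actually exceeding the 32-bit signed maximum (2,147,483,647). A hedged sketch of that check follows; it reads the row count from the sys.dm_db_partition_stats view instead of scanning the table, and the variable name @TotalRows and the output column names are illustrative. It only tells you whether the table is past the INT32 boundary; it does not prove that the cursor is the cause.

```sql
DECLARE @TotalRows BIGINT;

-- Row count from partition metadata (no table scan).
-- index_id 0 = heap, 1 = clustered index; other indexes would double count.
SELECT @TotalRows = SUM(ps.row_count)
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'TestTable')
  AND ps.index_id IN (0, 1);

SELECT
    @TotalRows                 AS TotalRows,
    CAST(2147483647 AS BIGINT) AS MaxInt32,
    CASE WHEN @TotalRows > 2147483647 THEN 1 ELSE 0 END AS ExceedsInt32;
```

To separate the cursor from SQLCMD, the same batch could also be run from another client (for example SSMS) and the result compared.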

Do you have any ideas about what is causing this?

1 answer:

Answer 0 (score: 1)

How about using a WHILE loop, while making sure that rows with the same parent_id are kept together:

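DECLARE @SegmentSize BIGINT = 1000000;
DECLARE @CurrentSegment BIGINT = 0;

WHILE 1 = 1
BEGIN

    -- Number the rows that fall into the current ID range.
    -- Note: segmenting by ID restarts ROW_NUMBER() in every segment, so a parent
    -- whose rows span two segments gets repeated ordinals; the revised version
    -- further down segments by Parent_ID instead.
    ;WITH UpdateData AS
    (
        SELECT  ID AS ID,
                ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE ID > @CurrentSegment AND ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
        SET Ordinal = UpdateData.Ordinal
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.Id = UpdateData.Id;

    -- Stop at the first segment that updates no rows.
    IF @@ROWCOUNT = 0
    BEGIN
        BREAK;
    END

    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END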
  

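Edit: modified to segment by Parent_Id, as requested. This should be reasonably fast because Parent_id is indexed (OPTION (RECOMPILE) is added to make sure the actual variable values are used for the seek). Because you are not updating the whole table in one statement, this limits transaction log growth!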

DECLARE @SegmentSize BIGINT = 1000000;
DECLARE @CurrentSegment BIGINT = 0;

WHILE 1 = 1
BEGIN

    -- Number the rows of every parent in the current Parent_ID range.
    -- A parent never straddles two segments, so its ordinals stay contiguous.
    ;WITH UpdateData AS
    (
        SELECT  ID AS ID,
                ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE Parent_ID > @CurrentSegment AND
              Parent_ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
        SET Ordinal = UpdateData.Ordinal
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.Id = UpdateData.Id
    OPTION (RECOMPILE);

    -- Stop at the first segment that updates no rows.
    IF @@ROWCOUNT = 0
    BEGIN
        BREAK;
    END

    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END
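
One caveat with the loop above: it stops at the first Parent_ID range that updates no rows, so a gap in Parent_ID values wider than the segment size would end it early. The following is a sketch of one possible variant, not part of the original answer, that walks from the minimum to the maximum Parent_ID instead. The variable @MaxParentId and the alias NEW_ORDINAL are illustrative, and the sketch updates through the CTE rather than joining back to the table.

```sql
SET NOCOUNT ON;

DECLARE @SegmentSize    BIGINT = 1000000;  -- width of each PARENT_ID range (assumed value)
DECLARE @CurrentSegment BIGINT;
DECLARE @MaxParentId    BIGINT;            -- illustrative name

-- Start just below the smallest PARENT_ID and stop once the largest is covered.
-- On an empty table both variables stay NULL and the loop body never runs.
SELECT @CurrentSegment = MIN(PARENT_ID) - 1,
       @MaxParentId    = MAX(PARENT_ID)
FROM TestTable;

WHILE @CurrentSegment < @MaxParentId
BEGIN
    -- Each UPDATE is its own autocommitted transaction, one PARENT_ID range at a time.
    ;WITH UpdateData AS
    (
        SELECT ORDINAL,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS NEW_ORDINAL
        FROM TestTable
        WHERE PARENT_ID >  @CurrentSegment
          AND PARENT_ID <= @CurrentSegment + @SegmentSize
    )
    UPDATE UpdateData
    SET ORDINAL = NEW_ORDINAL   -- implicit BIGINT -> TINYINT; fails if a parent has more than 255 rows
    OPTION (RECOMPILE);

    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END;
```

Empty ranges cost only an index seek, so sparse Parent_ID values slow the loop down but no longer terminate it. If Parent_ID values are very sparse, a smaller number of wider segments may be a better trade-off.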