I have a table with 500 million rows in SQL Server 2014 (Developer Edition, x64, Windows 10 Pro x64):
CREATE TABLE TestTable
(
ID BIGINT IDENTITY(1,1),
PARENT_ID BIGINT NOT NULL,
CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED (ID)
);
CREATE NONCLUSTERED INDEX IX_TestTable_ParentId
ON TestTable (PARENT_ID);
I am trying to apply the following patch:
-- Create non-nullable column with default (should be online operation in Enterprise/Developer edition)
ALTER TABLE TestTable
ADD ORDINAL TINYINT NOT NULL CONSTRAINT DF_TestTable_Ordinal DEFAULT 0;
GO
-- Populate column value for existing data
BEGIN
SET NOCOUNT ON;
DECLARE @BATCH_SIZE BIGINT = 1000000;
DECLARE @COUNTER BIGINT = 0;
DECLARE @ROW_ID BIGINT;
DECLARE @ORDINAL BIGINT;
DECLARE ROWS_C CURSOR
LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY
FOR
SELECT
ID AS ID,
ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
FROM
TestTable;
OPEN ROWS_C;
FETCH NEXT FROM ROWS_C
INTO @ROW_ID, @ORDINAL;
BEGIN TRANSACTION;
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE TestTable
SET
ORDINAL = CAST(@ORDINAL AS TINYINT)
WHERE
ID = @ROW_ID;
FETCH NEXT FROM ROWS_C
INTO @ROW_ID, @ORDINAL;
SET @COUNTER = @COUNTER + 1;
IF @COUNTER = @BATCH_SIZE
BEGIN
COMMIT TRANSACTION;
SET @COUNTER = 0;
BEGIN TRANSACTION;
END;
END;
COMMIT TRANSACTION;
CLOSE ROWS_C;
DEALLOCATE ROWS_C;
SET NOCOUNT OFF;
END;
GO
-- Drop default constraint from the column
ALTER TABLE TestTable
DROP CONSTRAINT DF_TestTable_Ordinal;
GO
-- Drop IX_TestTable_ParentId index
DROP INDEX IX_TestTable_ParentId
ON TestTable;
GO
-- Create IX_TestTable_ParentId_Ordinal index
CREATE UNIQUE INDEX IX_TestTable_ParentId_Ordinal
ON TestTable (PARENT_ID, ORDINAL);
GO
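The purpose of the patch is to add a column named ORDINAL that holds each record's ordinal number among the records sharing the same parent (identified by PARENT_ID). The patch is run with SQLCMD.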
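To make the intended numbering concrete, here is a minimal sketch on a handful of rows. The table variable `@Demo` and its sample values are made up for illustration; only the `ROW_NUMBER()` expression is taken from the patch.

```sql
-- Hypothetical sample data, not from the real table.
DECLARE @Demo TABLE (ID BIGINT PRIMARY KEY, PARENT_ID BIGINT NOT NULL);

INSERT INTO @Demo (ID, PARENT_ID)
VALUES (1, 10), (2, 10), (3, 20), (4, 10), (5, 20);

-- Same windowing expression as the cursor query in the patch:
-- numbering restarts at 1 for every PARENT_ID and follows ID order.
SELECT
    ID,
    PARENT_ID,
    ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
FROM @Demo
ORDER BY ID;
-- ID 1 -> 1, ID 2 -> 2, ID 3 -> 1, ID 4 -> 3, ID 5 -> 2
```

Because the ordinal restarts for every parent, the pair (PARENT_ID, ORDINAL) is unique, which is what the final unique index in the patch relies on.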
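The patch is written this way for the following reasons:
It works fine on a small database with a few million rows, but when applied to the large table described above, I get: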
Msg 3606, Level 16, State 2, Server XXX, Line 22
Arithmetic overflow occurred.
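My first guess was that an ORDINAL value was too large to fit in the TINYINT column, but that is not the case. I created a test database with a similar structure and filled it with data (more than 255 rows per parent). The error I got there is also an arithmetic error, but with a different message number and different wording (it explicitly says that the data cannot fit into a TINYINT).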
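The TINYINT hypothesis can also be checked directly on the large table. The sketch below assumes the `TestTable` schema shown above; the `CHILD_COUNT` alias and the `TOP (10)` limit are arbitrary choices for illustration.

```sql
-- List parents with more than 255 children, i.e. parents whose
-- ordinals would not fit into a TINYINT (maximum value 255).
SELECT TOP (10)
    PARENT_ID,
    COUNT_BIG(*) AS CHILD_COUNT
FROM TestTable
GROUP BY PARENT_ID
HAVING COUNT_BIG(*) > 255
ORDER BY CHILD_COUNT DESC;

-- For comparison, the explicit conversion failure can be reproduced
-- in isolation. This statement fails with an arithmetic overflow
-- error that names the tinyint data type.
DECLARE @ORDINAL BIGINT = 256;
SELECT CAST(@ORDINAL AS TINYINT);
```

If the first query returns no rows, the cast in the patch cannot be the source of the overflow, which matches the observation that the two error messages differ.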
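At the moment I have a few suspicions, but I could not find anything that helps:
Do you have any thoughts on this problem?
Answer 0 (score: 1)
How about using a WHILE loop instead, while making sure that rows with the same parent_id are kept together: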
DECLARE @SegmentSize BIGINT = 1000000;
DECLARE @CurrentSegment BIGINT = 0;
WHILE 1 = 1
BEGIN
    -- Note: ROW_NUMBER() only sees the rows in the current ID range, so a
    -- parent whose rows span two ranges gets its numbering restarted.
    ;WITH UpdateData AS
    (
        SELECT ID AS ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE ID > @CurrentSegment AND ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
    SET ORDINAL = UpdateData.ORDINAL
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.ID = UpdateData.ID;
    -- Stop at the first range that contains no rows.
    IF @@ROWCOUNT = 0
    BEGIN
        BREAK;
    END;
    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END;
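Edit: modified to segment by Parent_Id, as requested. This should be reasonably fast because Parent_Id is indexed (OPTION (RECOMPILE) was added to make sure the actual variable values are used for the seek). Because you are not updating the whole table in one statement, this will also limit transaction log growth!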
DECLARE @SegmentSize BIGINT = 1000000;
DECLARE @CurrentSegment BIGINT = 0;
WHILE 1 = 1
BEGIN
    -- Each pass covers a range of PARENT_ID values, so all rows of a
    -- given parent are numbered together.
    ;WITH UpdateData AS
    (
        SELECT ID AS ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE PARENT_ID > @CurrentSegment AND
              PARENT_ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
    SET ORDINAL = UpdateData.ORDINAL
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.ID = UpdateData.ID
    OPTION (RECOMPILE);
    -- Stop at the first PARENT_ID range that contains no rows.
    IF @@ROWCOUNT = 0
    BEGIN
        BREAK;
    END;
    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END;
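One caveat with stopping on `@@ROWCOUNT = 0` is that the loop ends at the first empty PARENT_ID range, even if higher values exist. A possible variant, sketched here under the assumption that PARENT_ID values are positive and the schema matches the question's `TestTable`, bounds the loop by the largest PARENT_ID instead. The variable `@MaxParentId` is introduced for illustration and is not part of the original answer.

```sql
SET NOCOUNT ON;

DECLARE @SegmentSize    BIGINT = 1000000;  -- width of each PARENT_ID range (tunable)
DECLARE @CurrentSegment BIGINT = 0;        -- assumes PARENT_ID values are > 0
DECLARE @MaxParentId    BIGINT;            -- hypothetical helper variable

SELECT @MaxParentId = MAX(PARENT_ID) FROM TestTable;

-- Walk every PARENT_ID range up to the maximum, so gaps in the
-- PARENT_ID sequence do not end the loop early.
WHILE @CurrentSegment < @MaxParentId
BEGIN
    ;WITH UpdateData AS
    (
        SELECT ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE PARENT_ID >  @CurrentSegment
          AND PARENT_ID <= @CurrentSegment + @SegmentSize
    )
    UPDATE TestTable
    SET ORDINAL = UpdateData.ORDINAL
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.ID = UpdateData.ID
    OPTION (RECOMPILE);

    SET @CurrentSegment = @CurrentSegment + @SegmentSize;
END;
```

Each UPDATE runs as its own autocommit transaction, so the log only has to hold one segment's worth of changes at a time. If the table is empty, `@MaxParentId` is NULL and the loop body never runs.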