This query caused our transaction log to grow to 25 GB. The database is in the SIMPLE recovery model.
INSERT INTO updbl.dbo.PopulationRelatives
    ( personid,
      personsex,
      relativeid,
      relativesex,
      degree,
      relationship,
      maternalpaternal )
SELECT DISTINCT
    personid = relative1,
    relative1sex,
    relative2,
    relative2sex,
    degree,
    relationship = Rel1Rel2,
    maternalpaternal
FROM UPDBwork.dbo.DegreeRelationship
By looping in batches, I was able to limit the growth to 8 GB.
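-- Variable declarations were not shown in the original post; INT is an assumed type.
DECLARE @PID INT, @MaxPID INT, @BatchSize INT, @ROWCOUNT INT

SELECT @PID = 0, @BatchSize = 1000000, @ROWCOUNT = 0
SELECT @MaxPID = MAX(relative1) FROM updbwork.dbo.DegreeRelationship

-- Process one range of relative1 values per iteration
WHILE @PID < @MaxPID + @BatchSize
BEGIN
    INSERT INTO updbl.dbo.PopulationRelatives
        ( personid,
          personsex,
          relativeid,
          relativesex,
          degree,
          relationship,
          maternalpaternal )
    SELECT DISTINCT
        personid = relative1,
        relative1sex,
        relative2,
        relative2sex,
        degree,
        relationship = Rel1Rel2,
        maternalpaternal
    FROM UPDBwork.dbo.DegreeRelationship
    WHERE relative1 BETWEEN @PID + 1 AND @PID + @BatchSize

    SET @PID = @PID + @BatchSize

    -- In SIMPLE recovery, a checkpoint lets the inactive part of the log be reused
    CHECKPOINT
END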
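This is not the best strategy, because each iteration produces a different number of rows depending on the DISTINCT values. Unfortunately, there is no good ID to partition the data on. Is there any way to control the size of each batch? I am considering adding TOP (X), but the engine would still have to do a lot of work to satisfy the DISTINCT. A cursor would be great, but then how would I find my DISTINCT values? I am just hoping for some brainstorming here. Thanks.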
Answer 0 (score: 0)
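This sounds like a bulk operation... If changing the recovery model is an option, temporarily switch it to bulk-logged. The following link may help: http://technet.microsoft.com/en-us/library/ms175987(v=SQL.105).aspx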
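
As a rough sketch of what that temporary switch could look like (this was not part of the original answer): the database name updbl is taken from the question, while the TABLOCK hint and the switch back to SIMPLE are assumptions. Whether the insert is actually minimally logged depends on the target table (for example, heap versus indexed, empty versus populated), so treat this as something to try on a test copy first.

```sql
-- Assumption: the target database is updbl, as in the question.
ALTER DATABASE updbl SET RECOVERY BULK_LOGGED;

-- Assumption: a TABLOCK hint on the target, which INSERT ... SELECT
-- generally needs before it can qualify for minimal logging.
INSERT INTO updbl.dbo.PopulationRelatives WITH (TABLOCK)
    ( personid, personsex, relativeid, relativesex,
      degree, relationship, maternalpaternal )
SELECT DISTINCT
    relative1, relative1sex, relative2, relative2sex,
    degree, Rel1Rel2, maternalpaternal
FROM UPDBwork.dbo.DegreeRelationship;

-- Return to the recovery model the question says the database started in.
ALTER DATABASE updbl SET RECOVERY SIMPLE;
```

Comparing the log file size before and after a run on a copy of the data would show whether the switch helps in this case.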