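So I'm working on some database 'de-identification', in which every piece of information gets changed. On most of the smaller tables (around 10,000 rows or so) a simple update isn't too time-consuming. I've now moved on to a table with roughly 500,000 rows.
I've read that the fastest way to accomplish this kind of "update" is actually to select the columns that need updating into a temp table. (I read that here: Fastest way to update 120 Million records)
The problem is that the OP there is updating all similar values with a single value, whereas each of my values is different; i.e. he is updating the NULL rows in a single column to -1, while I am updating every row with a new, more or less random date. This is what I have so far: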
--The only Index on Treatments is a Clustered Primary Key (TreatmentID)
SELECT * INTO #Treatments_temp
FROM Treatments
CREATE CLUSTERED INDEX IDX_Treatments ON #Treatments_temp(TreatmentID)
SET @rows = (SELECT TOP 1 TreatmentID
FROM Treatments
ORDER BY TreatmentID Desc)
WHILE (@rows > 0)
BEGIN
--There are only 500,000 records in this table according to COUNT(*), but the PK goes much
--higher (some records were deleted, made in error, etc.), so this IF statement is my
--attempt to skip the code for @rows values that don't actually exist.
IF (SELECT TreatmentID FROM #Treatments_temp WHERE TreatmentID = @rows) IS NOT NULL
BEGIN
DECLARE @year INT;
DECLARE @month INT;
DECLARE @date INT;
DECLARE @newStartDate SMALLDATETIME;
DECLARE @multiplier FLOAT;
SET @multiplier = (SELECT RAND());
SET @year = @multiplier * 99 + 1900;
SET @month = @multiplier * 11 + 1;
SET @date = @multiplier * 27 + 1;
SET @newStartDate = DATEADD(MONTH,((@year-1900)*12)+@month-1,@date-1);
UPDATE #Treatments_temp
SET StartDate = @newStartDate
WHERE TreatmentID = @rows
UPDATE #Treatments_temp
SET EndDate = DATEADD(MINUTE, @timebetween, @newStartDate)
WHERE TreatmentID = @rows
END
SET @rows = @rows - 1
END
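Answer 0 (score: 2)
Without knowing more about what you have, I think the simplest approach is to build a set of new values keyed by ID, with one entry for each ID, and then update your Treatment table with an INNER JOIN to pick up the new values. No row-by-row approach is needed.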
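The set-based idea above could be sketched roughly as follows. This is only a sketch under assumptions: it takes the TreatmentID, StartDate and EndDate columns from the question, the names #NewDates and NewStartDate are hypothetical, and keeping each row's original duration for EndDate is a guess at what @timebetween was meant to hold.

```sql
-- Sketch only: #NewDates and NewStartDate are hypothetical names;
-- TreatmentID, StartDate and EndDate are taken from the question.

-- 1) Generate one random start date per ID in a single pass.
--    NEWID() is evaluated for each row, so every row gets its own value.
--    The modulo is applied before ABS() so ABS() never sees the minimum int.
--    36,500 days from 1 Jan 1900 stays inside the year 1999.
SELECT TreatmentID,
       DATEADD(DAY, ABS(CHECKSUM(NEWID()) % 36500), '19000101') AS NewStartDate
INTO   #NewDates
FROM   Treatments;

CREATE UNIQUE CLUSTERED INDEX IX_NewDates ON #NewDates (TreatmentID);

-- 2) Apply all new values with one UPDATE ... INNER JOIN.
--    Column references on the right-hand side see the pre-update values,
--    so the original duration in minutes carries over to the new EndDate
--    (assumption: that is the intended meaning of @timebetween).
UPDATE t
SET    t.StartDate = n.NewStartDate,
       t.EndDate   = DATEADD(MINUTE,
                             DATEDIFF(MINUTE, t.StartDate, t.EndDate),
                             n.NewStartDate)
FROM   Treatments AS t
INNER JOIN #NewDates AS n
        ON n.TreatmentID = t.TreatmentID;

DROP TABLE #NewDates;
```

Materializing the random values in a temp table first means each ID's value is computed once, so StartDate and EndDate are derived from the same new date, and gaps in the ID sequence need no special handling.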
Answer 1 (score: 1)
I think this should work:
-- using NewID() instead of Rand() because Rand() is only evaluated once for the entire query, while NewID() is evaluated for each record
-- Based on your logic, I understand newStartDate has to be between 1 jan 1900 and 28 dec 1999
DECLARE @multiplier float
DECLARE @max_int float
DECLARE @daterange float
SELECT @max_int = Power(Convert(float, 2), 31), -- signed int !
@daterange = DateDiff(day, '1 jan 1900', '28 dec 1999')
UPDATE Treatments
SET @multiplier = (@max_int - Convert(real, ABS(BINARY_CHECKSUM(NewID())))) / @max_int, -- returns something between 0 and 1
StartDate = DateAdd(day, Convert(int, (@daterange * @multiplier)), '1 jan 1900') -- returns somewhere in the daterange
-- test 'spread'
SELECT COUNT(*), COUNT(DISTINCT StartDate), Min(StartDate), Max(StartDate) FROM Treatments
If anyone wants to test this, you can use the following to generate some test data (@Kulingar: make sure you don't accidentally drop your table =)
IF DB_ID('test') IS NULL CREATE DATABASE test
GO
USE test
GO
IF Object_ID('test..Treatments') IS NOT NULL DROP TABLE test..Treatments
GO
SELECT row_id = IDENTITY(int, 1, 1), StartDate = CURRENT_TIMESTAMP INTO Treatments FROM sys.columns, sys.objects
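The point made in this answer's first comment, that Rand() is evaluated once per query while NewID() is evaluated per record, can be checked with a small query like the one below. It is only an illustrative sketch; the column aliases are made up, and sys.objects is used simply because it always contains rows.

```sql
-- RAND() without a seed is evaluated once for the whole query,
-- so every returned row shows the same value.
-- NEWID() is evaluated for each row, so values derived from it differ.
SELECT TOP (5)
       RAND()                        AS rand_value,        -- identical on every row
       ABS(CHECKSUM(NEWID()) % 1000) AS newid_value,       -- varies per row
       RAND(CHECKSUM(NEWID()))       AS seeded_rand_value  -- per-row float, seeded from NEWID()
FROM   sys.objects;
```

Seeding RAND() from CHECKSUM(NEWID()) is another way to get a per-row float, which may be easier to read than scaling the checksum by hand.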
Answer 2 (score: 0)
I ended up writing a small program:
This way you do the logic outside the database and limit the CRUD operations to what is necessary.