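I am using an UPDATE with a join to update some records. I am essentially joining an indexed table to itself and updating the rows where a pattern is satisfied.
This query works for about a million records, but with 14 million records it simply does not scale. I do it this way because the only other option I know of is a cursor, which would be brutally slow.
Right now the query has been running for more than 12 hours. Any help finding a better way to do this would be greatly appreciated. I am using SQL Server Management Studio. For the query below, this is how the index on the AIS_Positions table was created: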
CREATE INDEX SID ON AIS_Positions (Id)
UPDATE R1
SET
BOUNDARY = 'BERTH',
TRAVEL_MODE = 'HOTEL',
BerthStartFlag = 'YES',
BerthStartTime = R1.IntervalStart,
BerthEndTime = R2.IntervalEnd,
BerthStart_ID = R1.Id,
BerthEnd_ID = R2.Id
FROM
AIS_Positions R1
INNER JOIN
AIS_Positions R2 ON R1.MMSI = R2.MMSI
AND R1.ID < R2.ID
AND R1.IntervalSpeed <= 0.1
AND R2.IntervalSpeed <= 0.1
AND DATEDIFF(HOUR, R1.POSITIONTIME, R2.POSITIONTIME) BETWEEN 1 AND 72
AND (SELECT TOP 1 IntervalSpeed
FROM AIS_Positions
WHERE MMSI = R1.MMSI AND ID = R1.ID-1) > 0.1
AND (SELECT TOP 1 IntervalSpeed
FROM AIS_Positions
WHERE MMSI = R1.MMSI AND ID = R2.ID+1) > 0.1
AND (SELECT TOP 1 Boundary
FROM AIS_Positions
WHERE MMSI = R1.MMSI AND ID = R1.ID-1) IS NULL
Answer 0 (score: 1)
This might be a good start:
/*
create nonclustered index [ix_ais_positions_mmsi_inc] on ais_positions
(mmsi)
include (id, intervalspeed, boundary, PositionTime, IntervalStart, IntervalEnd);
*/
update R1 set
boundary = 'BERTH', --literals kept in the same case as the original query so the stored values do not change
travel_mode = 'HOTEL',
BerthStartFlag = 'YES',
BerthStartTime = R1.IntervalStart,
BerthEndTime = R2.IntervalEnd,
BerthStart_id = R1.Id,
BerthEnd_id = R2.Id
from ais_positions R1
inner join ais_positions R2
on R1.mmsi = R2.mmsi
and R1.id < R2.id
--How many matches does R1.id < R2.id yield? Is this updating the same row more than once?
and R1.IntervalSpeed <= 0.1
and R2.IntervalSpeed <= 0.1
--and datediff(hour, R1.positiontime, R2.positiontime) between 1 and 72
and datediff(hour, R1.positiontime, R2.positiontime) >= 1 and datediff(hour, R1.positiontime, R2.positiontime) <= 72
--and (select top 1 IntervalSpeed from ais_positions where mmsi = R1.mmsi and id = R1.id-1) > 0.1
and exists (select 1 from ais_positions i where i.mmsi = R1.mmsi and i.id = R1.id-1 and i.IntervalSpeed > 0.1 and i.Boundary is null)
--and (select top 1 IntervalSpeed from ais_positions where mmsi = R1.mmsi and id = R2.id+1) > 0.1
and exists (select 1 from ais_positions where mmsi = R1.mmsi and id = R2.id+1 and IntervalSpeed > 0.1)
--and (select top 1 Boundary from ais_positions where mmsi = R1.mmsi and id=R1.id-1) is null
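The question raised in the comment above (how many R2 rows match each R1 row) can be checked before running the update. The following is only a rough diagnostic sketch, assuming the column names from the question; on the full 14 million rows it may itself be slow, so it is probably worth restricting it to a single MMSI first.

```sql
-- Diagnostic sketch: list R1 rows that pair with more than one R2 row.
-- Any row returned here would be matched several times by the UPDATE,
-- and SQL Server does not guarantee which match supplies the final values.
SELECT TOP (20)
    R1.Id,
    COUNT(*) AS MatchCount
FROM AIS_Positions AS R1
INNER JOIN AIS_Positions AS R2
    ON  R1.MMSI = R2.MMSI
    AND R1.Id < R2.Id
    AND R1.IntervalSpeed <= 0.1
    AND R2.IntervalSpeed <= 0.1
    AND DATEDIFF(HOUR, R1.PositionTime, R2.PositionTime) BETWEEN 1 AND 72
    -- same "next row is moving" test that the UPDATE applies to R2
    AND EXISTS (SELECT 1
                FROM AIS_Positions AS n
                WHERE n.MMSI = R2.MMSI
                  AND n.Id = R2.Id + 1
                  AND n.IntervalSpeed > 0.1)
-- Optionally add a WHERE clause on R1.MMSI here to sample one vessel.
GROUP BY R1.Id
HAVING COUNT(*) > 1
ORDER BY MatchCount DESC;
```

If this returns rows, the join is many-to-one for those Ids, and the update would need an extra rule (for example, taking only the earliest qualifying R2) to be deterministic.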
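Answer 1 (score: 1)
Have you considered using temp tables for the subquery conditions? Your query is probably running the subqueries once for every row of the query above them. Maybe something like this: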
-- Speed of the previous row (ID - 1), keyed by the current row's ID
SELECT A2.ID, A1.IntervalSpeed as topint1
INTO #Int_tabl_1
FROM AIS_Positions as A1
INNER JOIN AIS_Positions as A2
ON A1.MMSI = A2.MMSI AND A1.ID = A2.ID - 1
-- Speed of the next row (ID + 1), keyed by the current row's ID
SELECT A2.ID, A1.IntervalSpeed as topint2
INTO #Int_tabl_2
FROM AIS_Positions as A1
INNER JOIN AIS_Positions as A2
ON A1.MMSI = A2.MMSI AND A1.ID = A2.ID + 1
-- Boundary of the previous row (ID - 1), keyed by the current row's ID
SELECT A2.ID, A1.Boundary
INTO #Bound_tbl
FROM AIS_Positions as A1
INNER JOIN AIS_Positions as A2
ON A1.MMSI = A2.MMSI AND A1.ID = A2.ID - 1
Then test topint1 > 0.1, topint2 > 0.1, and Boundary is null.
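One way the temp-table idea could be taken a step further is to precompute only the rows that can start or end a berth interval, and then join those two much smaller sets. This is a sketch rather than a tested solution: the temp table names #BerthStarts and #BerthEnds are made up for illustration, and it assumes, as the original query does, that the neighbouring rows of a vessel are the ones with Id - 1 and Id + 1 under the same MMSI.

```sql
-- Candidate starts: slow rows whose previous row (Id - 1, same MMSI)
-- was moving and had no Boundary assigned yet.
SELECT cur.Id, cur.MMSI, cur.PositionTime, cur.IntervalStart
INTO #BerthStarts            -- hypothetical name
FROM AIS_Positions AS cur
INNER JOIN AIS_Positions AS prev
    ON prev.MMSI = cur.MMSI AND prev.Id = cur.Id - 1
WHERE cur.IntervalSpeed <= 0.1
  AND prev.IntervalSpeed > 0.1
  AND prev.Boundary IS NULL;

-- Candidate ends: slow rows whose next row (Id + 1, same MMSI) is moving.
SELECT cur.Id, cur.MMSI, cur.PositionTime, cur.IntervalEnd
INTO #BerthEnds              -- hypothetical name
FROM AIS_Positions AS cur
INNER JOIN AIS_Positions AS nxt
    ON nxt.MMSI = cur.MMSI AND nxt.Id = cur.Id + 1
WHERE cur.IntervalSpeed <= 0.1
  AND nxt.IntervalSpeed > 0.1;

-- Pair starts with ends using the same rules as the original join,
-- then update only the start rows.
UPDATE R1
SET BOUNDARY       = 'BERTH',
    TRAVEL_MODE    = 'HOTEL',
    BerthStartFlag = 'YES',
    BerthStartTime = s.IntervalStart,
    BerthEndTime   = e.IntervalEnd,
    BerthStart_ID  = s.Id,
    BerthEnd_ID    = e.Id
FROM AIS_Positions AS R1
INNER JOIN #BerthStarts AS s
    ON s.MMSI = R1.MMSI AND s.Id = R1.Id
INNER JOIN #BerthEnds AS e
    ON  e.MMSI = s.MMSI
    AND e.Id > s.Id
    AND DATEDIFF(HOUR, s.PositionTime, e.PositionTime) BETWEEN 1 AND 72;

DROP TABLE #BerthStarts;
DROP TABLE #BerthEnds;
```

The neighbour checks now run once per row while the temp tables are built, instead of once per candidate pair in the self-join. As with the original query, a start row that pairs with several end rows is still updated from an arbitrary one of them, so an additional tie-breaking rule may be needed.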