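I was given a simple task: for each unique ID in a table, select the ID along with its min and max values. So I wrote a simple GROUP BY, but the query takes a long time to execute (30-60 seconds):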
SELECT CHPDataElement.DataElementID,
       MIN(CHPDataElementData.UTCDataTime) AS MinDataTime,
       MAX(CHPDataElementData.UTCDataTime) AS MaxDataTime
FROM CHPDataElement
INNER JOIN CHPDataElementData
    ON CHPDataElement.DataElementID = CHPDataElementData.DataElementID
GROUP BY CHPDataElement.DataElementID
ORDER BY CHPDataElement.DataElementID
So I started working on an improvement and came up with a simple loop that returns the same data in 0.3-0.5 seconds:
declare @result table
(
DataElementID int,
MinDataTime datetime NULL,
MaxDataTime datetime null
)
declare @currentID int
declare @nextID int
declare @time datetime
insert into @result (DataElementID, MinDataTime, MaxDataTime)
select DataElementID,null,null from CHPDataElement
order by DataElementID
-- order by makes TOP 1 start from the smallest ID deterministically
select top 1 @currentID=DataElementID from @result order by DataElementID
while @currentID is not null
begin
print @currentID
-- reset so an ID with no data rows does not inherit the previous ID's time
set @time = null
select top 1 @time=UTCDataTime from CHPDataElementData
where DataElementID = @currentID
order by UTCDataTime asc
update @result set MinDataTime = @time
where DataElementID = @currentID
select top 1 @time=UTCDataTime from CHPDataElementData
where DataElementID = @currentID
order by UTCDataTime desc
update @result set MaxDataTime = @time
where DataElementID = @currentID
set @nextID = null
-- order by makes TOP 1 pick the smallest remaining ID, so no ID is skipped
select top 1 @nextID=DataElementID from @result where DataElementID > @currentID
order by DataElementID
set @currentID = @nextID
end
select * from @result
Can anyone explain why the GROUP BY is so inefficient compared to the second query?
Answer 0: (score: 0)
Give CHPDataElementData an index on DataElementID.
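A minimal sketch of such an index, assuming SQL Server syntax as used elsewhere in this thread. The index name IX_CHPDataElementData_DataElementID is hypothetical; pick whatever fits your naming convention.

```sql
-- Hypothetical index name; the table and column come from the question.
CREATE NONCLUSTERED INDEX IX_CHPDataElementData_DataElementID
ON CHPDataElementData
(
    DataElementID ASC
);
```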
Answer 1: (score: 0)
Add an index on DataElementID, UTCDataTime to CHPDataElementData,
CREATE NONCLUSTERED INDEX IX_CHPDataElementData_DataElementID_UTCDataTime
ON CHPDataElementData
(
DataElementID ASC,
UTCDataTime ASC
)
then use this statement, which reads only CHPDataElementData,
SELECT
CHPDataElementData.DataElementID,
MIN(CHPDataElementData.UTCDataTime) AS MinDataTime,
MAX(CHPDataElementData.UTCDataTime) AS MaxDataTime
FROM
CHPDataElementData
GROUP BY
CHPDataElementData.DataElementID;
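If you want to keep the per-ID TOP 1 lookups from the loop in the question but express them as one statement, something like the following may work. This is only a sketch: it assumes SQL Server 2005 or later (for CROSS APPLY), and the aliases e, d, mn and mx are introduced here purely for illustration. Whether it beats the GROUP BY depends on the plan the optimizer picks, so compare the execution plans on your own data.

```sql
-- For each element, fetch the earliest and latest UTCDataTime with a
-- TOP 1 lookup, the same access pattern the WHILE loop uses.
SELECT
    e.DataElementID,
    mn.UTCDataTime AS MinDataTime,
    mx.UTCDataTime AS MaxDataTime
FROM CHPDataElement AS e
CROSS APPLY (
    -- earliest row for this element
    SELECT TOP 1 d.UTCDataTime
    FROM CHPDataElementData AS d
    WHERE d.DataElementID = e.DataElementID
    ORDER BY d.UTCDataTime ASC
) AS mn
CROSS APPLY (
    -- latest row for this element
    SELECT TOP 1 d.UTCDataTime
    FROM CHPDataElementData AS d
    WHERE d.DataElementID = e.DataElementID
    ORDER BY d.UTCDataTime DESC
) AS mx
ORDER BY e.DataElementID;
```

Like the INNER JOIN in the question, CROSS APPLY leaves out elements that have no rows in CHPDataElementData.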