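I find myself needing to retrieve matches for roughly 1.5 million items, on average, from a remote database. There are two tables (ITEM1 and ITEM2) that hold dated item information. An item should always have at least one record in ITEM1, and the same item may have zero to many records in ITEM2. I have to find the latest record for each item in each table and, if one exists in ITEM2, use that information instead of ITEM1's. #TEMPA is the table holding the initial ~1.5 million ItemNumbers.
Here is the query: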
SELECT GETDATE() AS DateElement, A.SourceStore, COALESCE(FR.original_cost,CO.original_cost) AS Cost
FROM #TEMPA A
INNER JOIN REMOTEDB.ITEM1 CO
ON CO.item_id = A.ItemNumber
AND CO.month_ending >= (SELECT MAX(month_ending) FROM REMOTEDB.ITEM1 CO2 WHERE CO2.item_id = A.ItemNumber)
LEFT JOIN REMOTEDB.ITEM2 FR
ON FR.item_id = A.ItemNumber
AND FR.month_ending >= (SELECT MAX(month_ending) FROM REMOTEDB.ITEM2 FR2 WHERE FR2.item_id = A.ItemNumber)
WHERE CO.item_id IS NOT NULL
OR FR.item_id IS NOT NULL
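Both ITEM tables have a unique clustered index on item_id and month_ending. I realize the subqueries may be a big performance hit, but I can't think of any other way to do this. Each item can have a different maximum month_ending date. Currently the query returns the correct information, but it takes about 2.6 hours. Any help optimizing this query to perform better would be appreciated.
Edit: I should mention that the query is also running under READ UNCOMMITTED.
I tried both of the answers' queries that use ROW_NUMBER, and each ran for about 20 minutes on the remote server itself. My original query finishes there in about 2 minutes. Over the linked server, my original query runs in about 17 minutes. I cancelled the other queries once they passed an hour.
Thoughts?
Thanks!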
Answer 0 (score: 3)
Rewrite the correlated MAX subqueries using ROW_NUMBER:
SELECT GETDATE() AS DateElement, A.SourceStore,
COALESCE(FR.original_cost,CO.original_cost) AS Cost
FROM #TEMPA A
INNER JOIN
(
SELECT *
FROM
(
SELECT original_cost,
item_id,
ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY month_ending DESC) AS rn
FROM REMOTEDB.ITEM1
) as dt
WHERE rn = 1
) AS CO
ON CO.item_id = A.ItemNumber
LEFT JOIN
(
SELECT *
FROM
(
SELECT original_cost,
item_id,
ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY month_ending DESC) AS rn
FROM REMOTEDB.ITEM2
) as dt
WHERE rn = 1
) as FR
ON FR.item_id = A.ItemNumber
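As a minimal sketch of how the ROW_NUMBER pattern behaves, the batch below runs it against a few made-up rows. The table variables @Item1, @Item2 and @TempA and all of their values are hypothetical stand-ins for REMOTEDB.ITEM1, REMOTEDB.ITEM2 and #TEMPA; the DATE type and multi-row VALUES assume SQL Server 2008 or later. The rn = 1 filter is placed in the join condition here, which should be equivalent to filtering inside the derived table.

```sql
-- Hypothetical sample data standing in for the remote tables.
DECLARE @Item1 TABLE (item_id INT NOT NULL, month_ending DATE NOT NULL,
                      original_cost DECIMAL(10,2) NOT NULL,
                      PRIMARY KEY (item_id, month_ending));
DECLARE @Item2 TABLE (item_id INT NOT NULL, month_ending DATE NOT NULL,
                      original_cost DECIMAL(10,2) NOT NULL,
                      PRIMARY KEY (item_id, month_ending));
DECLARE @TempA TABLE (ItemNumber INT NOT NULL PRIMARY KEY, SourceStore INT NOT NULL);

INSERT INTO @Item1 VALUES (1, '2012-01-31', 10.00), (1, '2012-02-29', 11.00),
                          (2, '2012-01-31', 20.00);
INSERT INTO @Item2 VALUES (1, '2012-01-31', 12.50), (1, '2012-03-31', 13.00);
INSERT INTO @TempA VALUES (1, 100), (2, 200);

SELECT A.ItemNumber,
       A.SourceStore,
       COALESCE(FR.original_cost, CO.original_cost) AS Cost
FROM @TempA AS A
INNER JOIN (
    -- Number each item's rows from newest to oldest.
    SELECT item_id, original_cost,
           ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY month_ending DESC) AS rn
    FROM @Item1
) AS CO
    ON CO.item_id = A.ItemNumber AND CO.rn = 1
LEFT JOIN (
    SELECT item_id, original_cost,
           ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY month_ending DESC) AS rn
    FROM @Item2
) AS FR
    ON FR.item_id = A.ItemNumber AND FR.rn = 1
ORDER BY A.ItemNumber;
-- Item 1 takes its cost from the newest ITEM2 row (13.00);
-- item 2 has no ITEM2 row and falls back to ITEM1 (20.00).
```

Note that ROW_NUMBER has to number every row of each table before the filter applies, which may be why this form ran slower than the correlated subqueries on tables with a clustered index on (item_id, month_ending).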
Answer 1 (score: 1)
If you are on SQL Server 2008 or later, try this...
;With OrderedItem1 As
(
Select Row_Number() Over (Partition By item_id Order By Month_Ending Desc) As recentOrderID,
item_id,
original_cost
From REMOTEDB.ITEM1
), OrderedItem2 As
(
Select Row_Number() Over (Partition By item_id Order By Month_Ending Desc) As recentOrderID,
item_id,
original_cost
From REMOTEDB.ITEM2
), maxItem1 As
(
Select item_id,
original_cost
From OrderedItem1
Where recentOrderID = 1
), maxItem2 As
(
Select item_id,
original_cost
From OrderedItem2
Where recentOrderID = 1
)
Select GetDate() As DateElement,
A.SourceStore,
IsNull(FR.original_cost,CO.original_cost) As Cost
From #TEMPA As A
Join maxItem1 As CO
On CO.item_id = A.ItemNumber
Left Join maxItem2 FR
On FR.item_id = A.ItemNumber
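...You mentioned in the original post that every item will always have a record in ITEM1, so your WHERE CO.item_id Is Not Null OR FR.item_id Is Not Null
does nothing (in fact, the inner join already filters those rows out).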
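The point about the redundant WHERE clause can be checked on a tiny sample. This is only a sketch: the table variables and values are hypothetical, and the latest-row logic is left out so that only the join behaviour is visible. Item 99 has no ITEM1 row, so the inner join drops it before the WHERE clause is ever evaluated.

```sql
-- Hypothetical sample data; item 99 exists only in @TempA.
DECLARE @Item1 TABLE (item_id INT NOT NULL, original_cost DECIMAL(10,2) NOT NULL);
DECLARE @Item2 TABLE (item_id INT NOT NULL, original_cost DECIMAL(10,2) NOT NULL);
DECLARE @TempA TABLE (ItemNumber INT NOT NULL, SourceStore INT NOT NULL);

INSERT INTO @Item1 VALUES (1, 10.00);
INSERT INTO @TempA VALUES (1, 100), (99, 200);

-- Without the filter.
SELECT COUNT(*) AS RowsWithoutFilter
FROM @TempA AS A
INNER JOIN @Item1 AS CO ON CO.item_id = A.ItemNumber
LEFT JOIN @Item2 AS FR ON FR.item_id = A.ItemNumber;

-- With the filter from the question: every surviving row already has
-- a non-NULL CO.item_id, so the predicate is always true.
SELECT COUNT(*) AS RowsWithFilter
FROM @TempA AS A
INNER JOIN @Item1 AS CO ON CO.item_id = A.ItemNumber
LEFT JOIN @Item2 AS FR ON FR.item_id = A.ItemNumber
WHERE CO.item_id IS NOT NULL
   OR FR.item_id IS NOT NULL;
-- Both counts are 1.
```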
Answer 2 (score: 0)
After a lot of testing and experimenting, I arrived at this, which performed better than anything else I tried:
SELECT DISTINCT oInv.Item_ID, oInv.Month_Ending, oInv.Original_Cost
FROM (
SELECT Item_ID, Month_Ending, Original_Cost
FROM ho_data.dbo.CO_Ho_Inven
UNION ALL
SELECT Item_ID, Month_Ending, Original_Cost
FROM ho_data.dbo.FR_Ho_Inven
) OInv
INNER JOIN (
SELECT UInv.Item_ID, MAX(UInv.Month_ending) AS Month_Ending, MAX(original_cost) AS original_cost
FROM (
SELECT Item_ID, Month_Ending, original_cost
FROM ho_data.dbo.CO_Ho_Inven
UNION ALL
SELECT Item_ID, Month_Ending, original_cost
FROM ho_data.dbo.FR_Ho_Inven
) UInv
GROUP BY UInv.Item_ID
) UINv
ON OInv.Item_ID = UInv.Item_ID
AND OInv.Month_Ending = UInv.Month_Ending
AND OInv.original_cost = UINv.original_cost
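One caveat with the query above: it takes MAX(Month_Ending) and MAX(original_cost) independently, so an item only matches when its newest row also carries its highest cost, and it picks the newest row across both tables rather than preferring ITEM2. A sketch that keeps the GROUP BY idea but applies it per table might look like the following. The table variables and values are hypothetical, and this has not been timed against the real tables.

```sql
-- Hypothetical sample data standing in for the two inventory tables.
DECLARE @Item1 TABLE (item_id INT NOT NULL, month_ending DATE NOT NULL,
                      original_cost DECIMAL(10,2) NOT NULL,
                      PRIMARY KEY (item_id, month_ending));
DECLARE @Item2 TABLE (item_id INT NOT NULL, month_ending DATE NOT NULL,
                      original_cost DECIMAL(10,2) NOT NULL,
                      PRIMARY KEY (item_id, month_ending));

-- Item 1: the newest ITEM1 row has a LOWER cost than an older row.
-- Item 2: ITEM1 is newer than ITEM2, but ITEM2 should still win.
INSERT INTO @Item1 VALUES (1, '2012-01-31', 15.00), (1, '2012-02-29', 11.00),
                          (2, '2012-03-31', 20.00);
INSERT INTO @Item2 VALUES (2, '2012-02-29', 19.00);

SELECT CO.item_id,
       COALESCE(FR.original_cost, CO.original_cost) AS Cost
FROM (
    -- Latest ITEM1 row per item: join back on the per-item MAX(month_ending).
    SELECT I.item_id, I.original_cost
    FROM @Item1 AS I
    INNER JOIN (SELECT item_id, MAX(month_ending) AS month_ending
                FROM @Item1
                GROUP BY item_id) AS M
        ON M.item_id = I.item_id AND M.month_ending = I.month_ending
) AS CO
LEFT JOIN (
    -- Latest ITEM2 row per item, built the same way.
    SELECT I.item_id, I.original_cost
    FROM @Item2 AS I
    INNER JOIN (SELECT item_id, MAX(month_ending) AS month_ending
                FROM @Item2
                GROUP BY item_id) AS M
        ON M.item_id = I.item_id AND M.month_ending = I.month_ending
) AS FR
    ON FR.item_id = CO.item_id
ORDER BY CO.item_id;
-- Item 1 returns 11.00 (its newest ITEM1 row); item 2 returns 19.00 (ITEM2 preferred).
```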