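In my computed data layer, I am trying to populate the customer's postcode as it was at the time of each order. A subset of the columns in the table being populated is shown below: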
CustomerOrders
(
CustomerID varchar(20),
...
OrderDate date,
...
CustomerPostcodeAtTimeOfOrder varchar(10)
)
This table is built by joining the Customers table, the Orders table and the CustomerAddress table, which looks like this:
CustomerAddress
(
CustomerID varchar(20),
AddressType varchar(10),
/*
AddressDetails
*/
StartDate date,
EndDate date,
AddressRank int
)
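As you can imagine, a customer may have several types of address recorded for a single date, so the intention when populating the CustomerOrders table is to join as follows: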
SELECT *
FROM Customers c
LEFT JOIN Orders o
ON o.CustomerID = c.CustomerID
OUTER APPLY
(
SELECT TOP 1 Postcode
FROM CustomerAddress ca
WHERE ca.CustomerID = c.CustomerID
AND o.OrderDate BETWEEN ca.StartDate AND ca.EndDate
ORDER BY AddressRank
) pc -- T-SQL requires an alias on the APPLY subquery; "pc" is an arbitrary name
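However, adding this join to the query causes a severe performance hit: returning 1000 rows goes from taking 4 seconds to taking 106 seconds.
Note that I have also added a non-clustered index on the CustomerAddress table. It is defined as follows: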
CREATE NONCLUSTERED INDEX IX_CustomerAddress
ON CustomerAddress (StartDate, EndDate)
INCLUDE (AddressRank, CustomerID, Postcode)
I am looking for any suggestions on the best way to approach this problem.
Answer 0 (score: 0)
I'm not entirely sure whether this will return results any faster, but you could rewrite the query like this:
;WITH OrderAddress AS
(
SELECT o.*,
ca.Postcode,
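-- Number the matching addresses per order, lowest AddressRank first (as in the question's TOP 1).
-- The key of Orders is not shown, so CustomerID + OrderDate stands in for it; use the real key if there is one.
RN = ROW_NUMBER() OVER(PARTITION BY o.CustomerID, o.OrderDate ORDER BY ca.AddressRank)
FROM CustomerAddress ca
INNER JOIN Orders o
ON ca.CustomerID = o.CustomerID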
AND o.OrderDate BETWEEN ca.StartDate AND ca.EndDate
)
SELECT *
FROM Customers c
LEFT JOIN ( SELECT *
FROM OrderAddress
WHERE RN = 1) o
ON o.CustomerID = c.CustomerID;
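One difference from the OUTER APPLY version is worth noting: the CTE uses an INNER JOIN, so an order with no address covering its OrderDate drops out, whereas OUTER APPLY keeps it with a NULL postcode. The following is a rough sketch of a variant that keeps such orders. It assumes that CustomerID plus OrderDate is enough to identify an order (the real key of Orders is not shown) and that Postcode lives in CustomerAddress, as the question's query implies. The alias oa is made up for illustration.

```sql
-- Sketch only: every order is kept; Postcode is NULL when no address covers OrderDate.
;WITH OrderAddress AS
(
    SELECT o.*,
           ca.Postcode,
           -- Assumption: CustomerID + OrderDate identifies an order.
           RN = ROW_NUMBER() OVER (PARTITION BY o.CustomerID, o.OrderDate
                                   ORDER BY ca.AddressRank)
    FROM Orders o
    LEFT JOIN CustomerAddress ca
        ON ca.CustomerID = o.CustomerID
       AND o.OrderDate BETWEEN ca.StartDate AND ca.EndDate
)
SELECT c.*,
       oa.OrderDate,
       oa.Postcode AS CustomerPostcodeAtTimeOfOrder
FROM Customers c
LEFT JOIN OrderAddress oa
    ON oa.CustomerID = c.CustomerID
   AND oa.RN = 1;   -- filter in the ON clause so customers without orders are still returned
```

Putting the RN = 1 filter in the ON clause rather than in a WHERE clause preserves the LEFT JOIN semantics of the original query.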
You should also post the index definitions on the address table.
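Regarding the index shown in the question: it is keyed on (StartDate, EndDate), while both versions of the query look addresses up by CustomerID first and only then filter on the date range. An index that leads with CustomerID may let SQL Server seek per customer instead of scanning date ranges. The sketch below reuses the columns from the question's own index. The index name is hypothetical, and whether it actually helps depends on the data and on the plan the optimizer picks.

```sql
-- Hypothetical index name; columns are the ones already used in the question's index.
CREATE NONCLUSTERED INDEX IX_CustomerAddress_CustomerID_Dates
ON CustomerAddress (CustomerID, StartDate, EndDate)
INCLUDE (AddressRank, Postcode);

-- To compare before and after, report I/O and timing for the session, then rerun the query.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

Comparing the logical reads on CustomerAddress with and without the new index should show whether the lookup is the bottleneck.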