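Please note that I have changed the table and field names to make this easier to read. Some of the original names are confusing.
I have three different tables: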
Retailer (Id+Code is a unique key)
- Id
- Code
- LastReturnDate
- ...
Delivery/DeliveryHistory (combination of Date+RetailerId is unique)
- Date
- RetailerId
- HasReturns
- ...
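Delivery and DeliveryHistory are almost identical. Data is periodically moved to the history table, and there is no foolproof way of knowing when this last happened. Generally, the Delivery table is quite small, usually fewer than 100,000 rows, while the history table typically has millions of rows.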
My task is to update the LastReturnDate field for each retailer based on the current highest Date value for which HasReturns is true in Delivery or DeliveryHistory.
Previously, this was solved with a view, DeliveryView, defined as follows:
SELECT Id, Code, MAX(Date) Date
FROM Delivery
WHERE HasReturns = 1
GROUP BY Id, Code
UNION
SELECT Id, Code, MAX(Date) Date
FROM DeliveryHistory
WHERE HasReturns = 1
GROUP BY Id, Code
and the following UPDATE statement:
UPDATE Retailer SET LastReturnDate = (
SELECT MAX(Date) FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHERE Code = :Code AND EXISTS (
SELECT * FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code
HAVING
MAX(Date) > LastReturnDate OR
(LastReturnDate IS NULL AND MAX(Date) IS NOT NULL))
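The EXISTS clause prevents updating fields where the current value is greater than the new value, but this is not really an important concern, since it is hard to see how that could happen during normal program execution. Also note that the AND MAX(Date) IS NOT NULL part is actually redundant, since Date can never be null in DeliveryView. However, the EXISTS clause does seem to improve performance slightly.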
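However, the performance of the UPDATE has recently been terrible. In a database where the Retailer table contains only 1,000-2,000 relevant entries, the UPDATE takes more than five minutes to run. Note that it does so even if I remove the entire EXISTS clause, i.e. with this very simple statement: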
UPDATE Retailer SET LastReturnDate = (
SELECT MAX(Date) FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHERE Code = :Code
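I have therefore been looking for a better solution. My first idea was to create a temporary table, but after a while I tried writing it as a MERGE statement: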
MERGE INTO Retailer
USING (SELECT Id, Code, MAX(Date) Date FROM DeliveryView GROUP BY Id, Code) DeliveryView
ON (Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHEN MATCHED THEN
UPDATE SET Retailer.LastReturnDate = DeliveryView.Date WHERE Retailer.Code = :Code
This seems to work, and it is an order of magnitude faster than the UPDATE.
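One difference from the original UPDATE is that this MERGE has no counterpart to the EXISTS guard, so it can overwrite a LastReturnDate that is already newer. A minimal sketch of a variant that keeps the guard is shown below, assuming Oracle 10g or later (where the MERGE update clause accepts its own WHERE). The aliases r and d and the column alias MaxDate are illustrative names, not part of the original schema, and Date stands in for the real (renamed) column as elsewhere in this question.

```sql
-- Sketch only: uses the placeholder table/column names from this question.
-- "Date" is a stand-in; the real schema needs a non-reserved column name.
MERGE INTO Retailer r
USING (
  SELECT Id, Code, MAX(Date) AS MaxDate
  FROM DeliveryView
  WHERE Code = :Code            -- filter before aggregating
  GROUP BY Id, Code
) d
ON (r.Id = d.Id AND r.Code = d.Code)
WHEN MATCHED THEN
  UPDATE SET r.LastReturnDate = d.MaxDate
  -- Same intent as the EXISTS/HAVING guard: only move the date forward.
  WHERE r.LastReturnDate IS NULL
     OR r.LastReturnDate < d.MaxDate;
```

Because the source subquery is restricted to :Code and the ON clause matches on Code, only retailers with that code can be touched. Retailers with no rows in DeliveryView are not matched and keep their current value, whereas the simplified UPDATE without EXISTS would set LastReturnDate to NULL for them.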
I have three questions:
Execution plan for the MERGE statement (Cost: 25,831, Bytes: 1,143,828):
Step, operation (cost, bytes and cardinality where reported):
14 MERGE STATEMENT REMOTE ALL_ROWS (Cost: 25,831; Bytes: 1,143,828; Cardinality: 3,738)
  13 MERGE SCHEMA.Retailer ORCL
    12 VIEW SCHEMA.
      11 HASH JOIN (Cost: 25,831; Bytes: 1,192,422; Cardinality: 3,738)
        9 VIEW SCHEMA. (Cost: 25,803; Bytes: 194,350; Cardinality: 7,475)
          8 SORT GROUP BY (Cost: 25,803; Bytes: 194,350; Cardinality: 7,475)
            7 VIEW VIEW SCHEMA.DeliveryView ORCL (Cost: 25,802; Bytes: 194,350; Cardinality: 7,475)
              6 SORT UNIQUE (Cost: 25,802; Bytes: 134,550; Cardinality: 7,475)
                5 UNION-ALL
                  2 SORT GROUP BY (Cost: 97; Bytes: 25,362; Cardinality: 1,409)
                    1 TABLE ACCESS FULL TABLE SCHEMA.Delivery [Analyzed] ORCL (Cost: 94; Bytes: 210,654; Cardinality: 11,703)
                  4 SORT GROUP BY (Cost: 25,705; Bytes: 109,188; Cardinality: 6,066)
                    3 TABLE ACCESS FULL TABLE SCHEMA.DeliveryHistory [Analyzed] ORCL (Cost: 16,827; Bytes: 39,333,636; Cardinality: 2,185,202)
        10 TABLE ACCESS FULL TABLE SCHEMA.Retailer [Analyzed] ORCL (Cost: 27; Bytes: 653,390; Cardinality: 2,230)
Execution plan for the UPDATE statement (Cost: 101,492, Bytes: 272,060):
Step, operation (cost, bytes and cardinality where reported):
14 UPDATE STATEMENT REMOTE ALL_ROWS (Cost: 101,492; Bytes: 272,060; Cardinality: 1,115)
  13 UPDATE SCHEMA.Retailer ORCL
    1 TABLE ACCESS FULL TABLE SCHEMA.Retailer [Analyzed] ORCL (Cost: 27; Bytes: 272,060; Cardinality: 1,115)
    12 VIEW SCHEMA. (Cost: 90; Bytes: 52; Cardinality: 2)
      11 SORT GROUP BY (Cost: 90; Bytes: 52; Cardinality: 2)
        10 VIEW VIEW SCHEMA.DeliveryView ORCL (Cost: 90; Bytes: 52; Cardinality: 2)
          9 SORT UNIQUE (Cost: 90; Bytes: 36; Cardinality: 2)
            8 UNION-ALL
              4 SORT GROUP BY (Cost: 15; Bytes: 18; Cardinality: 1)
                3 TABLE ACCESS BY INDEX ROWID TABLE SCHEMA.Delivery [Analyzed] ORCL (Cost: 14; Bytes: 108; Cardinality: 6)
                  2 INDEX RANGE SCAN INDEX SCHEMA.DeliveryHasReturns [Analyzed] ORCL (Cost: 2; Cardinality: 12)
              7 SORT GROUP BY (Cost: 75; Bytes: 18; Cardinality: 1)
                6 TABLE ACCESS BY INDEX ROWID TABLE SCHEMA.DeliveryHistory [Analyzed] ORCL (Cost: 74; Bytes: 4,590; Cardinality: 255)
                  5 INDEX RANGE SCAN INDEX SCHEMA.DeliveryHistoryHasReturns [Analyzed] ORCL (Cost: 6; Cardinality: 509)
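For anyone who wants to reproduce these plans outside a GUI tool, a minimal sketch using Oracle's EXPLAIN PLAN and the DBMS_XPLAN package follows. It assumes the default PLAN_TABLE is available to the session, and it reuses the placeholder names from this question, so Date would have to be replaced by the real column name.

```sql
-- Sketch: ask the optimizer for its estimated plan of the simple UPDATE.
-- Nothing is executed or modified by EXPLAIN PLAN itself.
EXPLAIN PLAN FOR
UPDATE Retailer SET LastReturnDate = (
  SELECT MAX(Date) FROM DeliveryView
  WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHERE Code = :Code;

-- Display the plan most recently written to PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

When reading the UPDATE plan, note that its total cost of 101,492 is close to the estimated 1,115 Retailer rows multiplied by the subquery cost of about 90. That would be consistent with the optimizer expecting the view subquery to be evaluated once per Retailer row, while the MERGE plan aggregates the view once and hash-joins the result. These figures are optimizer estimates, not measured run times.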