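Suppose I have a very large table in my Oracle database that holds data for thousands of items. The data is updated very frequently throughout the day, and every update gets a time stamp.
So, for example, the table looks like this (I know the column names are bad; they are just for illustration):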
TBLDaily:
Date: ItemNo: CharA: .... CharN: Time_Stamp:
2014/02/15 123 .... 2014/02/15 10:00AM
2014/02/15 123 .... 2014/02/15 11:00AM
2014/02/15 123 .... 2014/02/15 02:13PM
2014/02/15 234 .... 2014/02/20 01:00PM
2014/02/15 234 .... 2014/02/20 09:00PM
...
2014/02/16 123 .... 2014/02/20 08:15PM
...
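Then I have another table with the same item numbers that stores additional information, but it stays static for the whole month, so it looks like this: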
TBLMonthly:
Date: ItemNo: CharA: .... CharK:
2014/01/31 123 ....
2014/01/31 234 ....
2013/12/31 123 ....
2013/12/31 234 ....
...
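Now, for each item number and each date, I need the most recent information available in the daily table, plus certain characteristics that, if they are missing there, should be taken from the monthly table.
My SQL query looks like this: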
WITH All_Data AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY A.Date, A.ItemNo ORDER BY A.Time_Stamp) AS RN,
A.Date, A.ItemNo,
NVL(A.CharA, B.CharA),
B.CharB,
... whatever other characteristics ...
FROM
TBLDaily A,
TBLMonthly B
WHERE
A.ItemNo = B.ItemNo
AND
A.Date BETWEEN To_Date('2012-12-31', 'yyyy-MM-dd') AND To_Date('2014-02-24', 'yyyy-MM-dd')
AND
B.Date = (SELECT max(Date) FROM TBLMonthly WHERE Date <= A.Date)
)
SELECT *
FROM All_Data
WHERE RN = 1
ORDER BY Date, ItemNo
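Now, this query takes a very long time to complete (I started it yesterday afternoon and it was still running this morning). I know this is a very large data set, but I have queried larger data sets much faster. My guess is that the cause is either: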
PARTITION BY
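or B.Date = (SELECT max(Date) FROM TBLMonthly WHERE Date <= A.Date)
But I am not sure, and worse, I don't know how to fix it so that it runs more efficiently and doesn't take this long.
Any ideas/help are greatly appreciated!!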
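One thing worth checking before tuning, if the goal is the most recent row per date and item: ROW_NUMBER() numbered by ascending Time_Stamp gives RN = 1 to the earliest row, not the latest. The following is only a minimal sketch using Python's built-in sqlite3 module (not Oracle) to show the difference; the table name `daily`, the columns `d`, `item`, `ts`, and the helper `first_row` are made up for illustration, and it assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (d TEXT, item INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?)",
    [
        ("2014-02-15", 123, "2014-02-15 10:00"),
        ("2014-02-15", 123, "2014-02-15 11:00"),
        ("2014-02-15", 123, "2014-02-15 14:13"),
    ],
)


def first_row(direction):
    """Return the ts of the row numbered 1 for the given sort direction."""
    # direction is only ever one of the two fixed literals passed below
    sql = f"""
        SELECT ts FROM (
            SELECT ts,
                   ROW_NUMBER() OVER (PARTITION BY d, item
                                      ORDER BY ts {direction}) AS rn
            FROM daily
        ) WHERE rn = 1
    """
    return conn.execute(sql).fetchone()[0]


earliest = first_row("ASC")   # what ORDER BY Time_Stamp selects
latest = first_row("DESC")    # what ORDER BY Time_Stamp DESC selects
print(earliest, latest)
```

In the Oracle query the equivalent change would be ORDER BY A.Time_Stamp DESC inside the OVER clause.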
Answer 0 (score: 2)
Your query might be simpler and faster with this approach:
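WITH t AS
(SELECT DISTINCT Date, ItemNo,
        -- an explicit frame is needed; the default frame stops at the current row
        LAST_VALUE(CharA) OVER (PARTITION BY Date, ItemNo ORDER BY Time_Stamp
                                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS CharA,
        MAX(Time_Stamp) OVER (PARTITION BY Date, ItemNo) AS Time_Stamp
 FROM TBLDaily)
SELECT *
FROM t
-- the question shows no Time_Stamp column in TBLMonthly, so adapt the second join condition to the real schema
JOIN TBLMonthly m ON m.ItemNo = t.ItemNo AND t.Time_Stamp = m.Time_Stamp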
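A note on LAST_VALUE: when an ORDER BY is given without an explicit frame, the window ends at the current row, so LAST_VALUE simply returns each row's own value and DISTINCT no longer collapses the group to one row. The sketch below illustrates this with Python's sqlite3 module rather than Oracle (both use the same default frame); the table `daily` and its columns are hypothetical, and SQLite 3.25 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (item INTEGER, ts TEXT, char_a TEXT)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?)",
    [(123, "10:00", "a"), (123, "11:00", "b"), (123, "14:13", "c")],
)

# Default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
default_rows = conn.execute(
    """
    SELECT DISTINCT LAST_VALUE(char_a) OVER (PARTITION BY item ORDER BY ts)
    FROM daily
    """
).fetchall()

# Explicit frame covering the whole partition
full_rows = conn.execute(
    """
    SELECT DISTINCT LAST_VALUE(char_a) OVER (
        PARTITION BY item ORDER BY ts
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM daily
    """
).fetchall()

default_vals = sorted(r[0] for r in default_rows)
full_vals = sorted(r[0] for r in full_rows)
print(default_vals, full_vals)
```

With the default frame the three rows stay distinct; with the full-partition frame they collapse to the value of the latest row.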
Answer 1 (score: 1)
Perhaps you could create a virtual column on the daily table. It should look something like this:
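-- Oracle only accepts a user-defined function in a virtual column if it is declared DETERMINISTIC.
-- This function reads TBLDaily, so its result can change as rows are added; treat the approach with care.
CREATE OR REPLACE FUNCTION Is_latest(V_item IN NUMBER, V_MONTH IN DATE, V_time_stamp IN DATE) RETURN DATE DETERMINISTIC IS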
last_ts DATE;
BEGIN
SELECT MAX(time_stamp)
INTO last_ts
FROM TBLDaily
WHERE ItemNo = V_item
AND DATE = V_MONTH;
IF last_ts = V_time_stamp THEN
RETURN trunc(last_ts, 'mm');
ELSE
RETURN NULL;
END IF;
END;
ALTER TABLE TBLDaily ADD month_of_TS GENERATED ALWAYS AS (Is_latest(ItemNo, Date, time_stamp));
CREATE INDEX IND_XXX on TBLDaily (ItemNo, month_of_TS);
Select *
from TBLDaily d
JOIN TBLMonthly m ON m.ItemNo = d.ItemNo and m.Date = d.month_of_TS
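To check any of these rewrites against a small sample, the result the question asks for (latest daily row per date and item, joined to the most recent monthly row on or before that date, with a fallback to the monthly value) can be reproduced in a throwaway database. This is only a sketch in Python's sqlite3 module, not Oracle; the tables `daily` and `monthly`, their columns, and the sample values are invented. It also assumes that the "most recent month" lookup should be restricted to the same item, which the original subquery does not do, and it uses COALESCE where the Oracle query uses NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE daily (d TEXT, item INTEGER, char_a TEXT, ts TEXT);
    CREATE TABLE monthly (d TEXT, item INTEGER, char_a TEXT, char_b TEXT);
    INSERT INTO daily VALUES
        ('2014-02-15', 123, 'old', '2014-02-15 10:00'),
        ('2014-02-15', 123, NULL,  '2014-02-15 14:13'),
        ('2014-02-15', 234, 'x',   '2014-02-15 13:00');
    INSERT INTO monthly VALUES
        ('2013-12-31', 123, 'm-dec', 'b-dec'),
        ('2014-01-31', 123, 'm-jan', 'b-jan'),
        ('2014-01-31', 234, 'n-jan', 'c-jan');
    """
)

rows = conn.execute(
    """
    WITH latest AS (
        -- rn = 1 marks the newest row per (date, item)
        SELECT d, item, char_a,
               ROW_NUMBER() OVER (PARTITION BY d, item ORDER BY ts DESC) AS rn
        FROM daily
    )
    SELECT l.d, l.item, COALESCE(l.char_a, m.char_a) AS char_a, m.char_b
    FROM latest l
    JOIN monthly m ON m.item = l.item
    WHERE l.rn = 1
      -- most recent monthly row for the same item on or before the daily date
      AND m.d = (SELECT MAX(m2.d) FROM monthly m2
                 WHERE m2.item = l.item AND m2.d <= l.d)
    ORDER BY l.d, l.item
    """
).fetchall()

for row in rows:
    print(row)
```

Item 123's latest daily row has no CharA value, so the January monthly value is used instead; item 234 keeps its daily value.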