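I am running a query with a window that spans one calendar month. The data I am working with is collected at regular intervals, for example every fifteen minutes.
The code looks like this: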
SELECT AVG(data_value) OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING AND CURRENT ROW)
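This query works well and produces the monthly averages. The only problem is that the start and end of the window are exactly one month apart and both bounds are inclusive, e.g. the window starts at 2019-11-01 00:00 and ends at 2019-12-01 00:00.
I need the starting bound to be excluded, because it is not considered part of the data set, e.g. the window should start at 2019-11-01 00:15 (the next row) while still ending at 2019-12-01 00:00.
I would like to know whether this is possible in Oracle.
I imagine the code would look something like this: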
SELECT AVG(data_value) OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN INTERVAL '1' MONTH (+ 1 ROW) PRECEDING AND CURRENT ROW)
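I have tried several variations, but Oracle rejects all of them. Any help would be greatly appreciated.
Answer 0 (score: 0)
You can calculate the number of days in the previous month using: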
EXTRACT( DAY FROM TRUNC( time_stamp, 'MM' ) - 1 )
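As a quick sanity check, the expression can be evaluated for a few hand-picked dates. This is only a sketch: the dates and the aliases d and days_in_prev_month are illustrative and do not come from the question.

```sql
-- Illustrative dates only; DUAL is Oracle's built-in one-row table.
-- TRUNC(d, 'MM') is the first day of d's month, so subtracting 1 gives the
-- last day of the previous month, whose day number is that month's length.
SELECT d,
       EXTRACT( DAY FROM TRUNC( d, 'MM' ) - 1 ) AS days_in_prev_month
FROM   (
         SELECT DATE '2019-12-01' AS d FROM DUAL UNION ALL  -- previous month: November 2019
         SELECT DATE '2020-02-15'      FROM DUAL UNION ALL  -- previous month: January 2020
         SELECT DATE '2020-03-31'      FROM DUAL            -- previous month: February 2020 (leap year)
       );
```

The three rows should come back with 30, 31 and 29 respectively.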
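Then use the NUMTODSINTERVAL function to build an interval that is one day shorter than the previous month (note the - 2 instead of - 1 in the query), so that the extra day being counted is excluded: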
SELECT id,
data_value,
time_stamp,
AVG(data_value)
OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN NUMTODSINTERVAL(
EXTRACT( DAY FROM TRUNC( time_stamp, 'MM' ) - 2 ),
'DAY'
) PRECEDING
AND CURRENT ROW
) AS avg_value_month_minus_1_day
FROM table_name;
So, if your data is:
CREATE TABLE table_name ( id, data_value, time_stamp ) AS
SELECT 1,
LEVEL,
DATE '2020-01-01' + LEVEL - 1
FROM DUAL
CONNECT BY LEVEL <= 50;
then the query above can be compared side by side with your original window:
SELECT id,
data_value,
time_stamp,
AVG(data_value)
OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN NUMTODSINTERVAL(
EXTRACT( DAY FROM TRUNC( time_stamp, 'MM' ) - 2 ),
'DAY'
) PRECEDING
AND CURRENT ROW
) AS avg_value_month_minus_1_day,
AVG(data_value)
OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
AND CURRENT ROW
) AS avg_value_month
FROM table_name;
Output (for February, when the previous month has a full set of data):
ID | DATA_VALUE | TIME_STAMP | AVG_VALUE_MONTH_MINUS_1_DAY | AVG_VALUE_MONTH
-: | ---------: | :------------------ | --------------------------: | --------------:
1 | 32 | 2020-02-01 00:00:00 | 17 | 16.5
1 | 33 | 2020-02-02 00:00:00 | 18 | 17.5
1 | 34 | 2020-02-03 00:00:00 | 19 | 18.5
1 | 35 | 2020-02-04 00:00:00 | 20 | 19.5
1 | 36 | 2020-02-05 00:00:00 | 21 | 20.5
1 | 37 | 2020-02-06 00:00:00 | 22 | 21.5
1 | 38 | 2020-02-07 00:00:00 | 23 | 22.5
1 | 39 | 2020-02-08 00:00:00 | 24 | 23.5
1 | 40 | 2020-02-09 00:00:00 | 25 | 24.5
1 | 41 | 2020-02-10 00:00:00 | 26 | 25.5
1 | 42 | 2020-02-11 00:00:00 | 27 | 26.5
1 | 43 | 2020-02-12 00:00:00 | 28 | 27.5
1 | 44 | 2020-02-13 00:00:00 | 29 | 28.5
1 | 45 | 2020-02-14 00:00:00 | 30 | 29.5
1 | 46 | 2020-02-15 00:00:00 | 31 | 30.5
1 | 47 | 2020-02-16 00:00:00 | 32 | 31.5
1 | 48 | 2020-02-17 00:00:00 | 33 | 32.5
1 | 49 | 2020-02-18 00:00:00 | 34 | 33.5
1 | 50 | 2020-02-19 00:00:00 | 35 | 34.5
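Answer 1 (score: 0)
Alas, Oracle does not support intervals that combine months with smaller units: an interval is either YEAR TO MONTH or DAY TO SECOND.
One approach is to subtract the boundary row back out: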
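-- Average over (t - 1 month, t]: totals of the closed window minus the row(s)
-- that sit exactly one month back. table_name is a placeholder.
select (sum(data_value) over (partition by id
order by time_stamp
range between interval '1' month preceding and current row
) -
-- sum() over an empty frame is NULL, so default it to 0
coalesce(sum(data_value) over (partition by id
order by time_stamp
range between interval '1' month preceding and interval '1' month preceding
), 0)
) /
-- nullif() guards against division by zero when every data_value in the frame is NULL
nullif(count(data_value) over (partition by id
order by time_stamp
range between interval '1' month preceding and current row
) -
count(data_value) over (partition by id
order by time_stamp
range between interval '1' month preceding and interval '1' month preceding
), 0) as avg_excl_start
from table_name;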
Admittedly, this is rather cumbersome for an average, but for sum() or count() it might be just fine.
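For a plain sum, the subtraction reduces to two window expressions. A minimal sketch, assuming the one-month window from the question and the placeholder table table_name; the alias sum_excl_start is made up for illustration:

```sql
SELECT id,
       time_stamp,
       -- closed window [t - 1 month, t]
       SUM(data_value) OVER (PARTITION BY id
                             ORDER BY time_stamp
                             RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
                                       AND CURRENT ROW)
       -- minus the row(s) sitting exactly on the lower bound; 0 if there are none,
       -- because SUM over an empty frame returns NULL
     - COALESCE(SUM(data_value) OVER (PARTITION BY id
                                      ORDER BY time_stamp
                                      RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
                                                AND INTERVAL '1' MONTH PRECEDING),
                0) AS sum_excl_start
FROM   table_name;
```

The COALESCE matters mainly for rows that have no counterpart exactly one month earlier, for example during the first month of data.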
Answer 2 (score: 0)
To shift the time range you are looking at, you can shift the value you order by by the appropriate interval:
SELECT AVG(data_value)
OVER (PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING AND CURRENT ROW
) Current_Calc
, AVG(data_value)
OVER (PARTITION BY id
ORDER BY time_stamp - interval '15' minute
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING AND CURRENT ROW
) Shift_Back
, AVG(data_value)
OVER (PARTITION BY id
ORDER BY time_stamp + interval '15' minute
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING AND CURRENT ROW
) shift_forward
FROM Your_Data
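Based on the description of the problem, I believe you want to shift it back by 15 minutes, but I may be misreading the problem statement, and there is no sample data with expected results to test against.
These sliding windows always contain one month of data relative to the current time_stamp. For each time_stamp you therefore get 28 to 31 days of data depending on the month (29 to 32 rows for daily data, since both bounds are included), and the same data is counted in averages for both the current and the previous month.
On the other hand, if what you are interested in is the average for discrete months, you should partition by month rather than create a sliding window. If you want a running average within each month you can add an ORDER BY, but you do not need a windowing clause: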
SELECT TRUNC(time_stamp, 'MM') MON
, AVG(data_value)
OVER (PARTITION BY id, TRUNC(time_stamp, 'MM')) MON_AVG
, AVG(data_value)
OVER (PARTITION BY id, TRUNC(time_stamp, 'MM')
ORDER BY time_stamp) RUN_MON_AVG
, TRUNC(time_stamp - INTERVAL '15' MINUTE, 'MM') MON_2
, AVG(data_value)
OVER (PARTITION BY id, TRUNC(time_stamp - INTERVAL '15' MINUTE, 'MM')
) MON_AVG_2
, AVG(data_value)
OVER (PARTITION BY id, TRUNC(time_stamp - INTERVAL '15' MINUTE, 'MM')
ORDER BY time_stamp) RUN_MON_AVG_2
FROM Your_Data
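To see what the 15-minute shift does at a month boundary, the two TRUNC expressions can be compared on a few hand-picked timestamps. This is a small sketch; the sample values and the aliases ts and ts_text are illustrative, not taken from the question.

```sql
SELECT TO_CHAR(ts, 'YYYY-MM-DD HH24:MI')                             AS ts_text,
       TO_CHAR(TRUNC(ts, 'MM'), 'YYYY-MM-DD')                        AS mon,
       TO_CHAR(TRUNC(ts - INTERVAL '15' MINUTE, 'MM'), 'YYYY-MM-DD') AS mon_2
FROM   (
         -- last reading before midnight, the midnight reading, and the one after it
         SELECT TO_DATE('2019-11-30 23:45', 'YYYY-MM-DD HH24:MI') AS ts FROM DUAL UNION ALL
         SELECT TO_DATE('2019-12-01 00:00', 'YYYY-MM-DD HH24:MI')       FROM DUAL UNION ALL
         SELECT TO_DATE('2019-12-01 00:15', 'YYYY-MM-DD HH24:MI')       FROM DUAL
       );
```

The 2019-12-01 00:00 row should be the only one where mon and mon_2 differ: the shift assigns it to November, which matches the requirement that a month's data ends at midnight on the first of the following month.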
Answer 3 (score: 0)
Thanks for the feedback! I was able to put together the solution I needed from the answers above. Here is the code I used:
SELECT AVG(data_value) OVER (
PARTITION BY id
ORDER BY time_stamp
RANGE BETWEEN (NUMTODSINTERVAL(EXTRACT( DAY FROM (TRUNC(time_stamp,'MM') - 1) ),'DAY') - NUMTODSINTERVAL(1,'SECOND')) PRECEDING AND CURRENT ROW)
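Because my window is exactly one month long and I want to drop its first entry, I first converted the length of the previous month into a day-to-second interval, as suggested above. I then subtracted one second from that interval. The effect is that the lower bound of the window becomes an "open" bound while the upper bound stays "closed".
As a side note, I used one second because the periodicity of my data set is not consistent, but its minimum spacing is three minutes, so any value smaller than that would work.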
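One way to check the open lower bound is to count the rows in each window for a single boundary row. The sketch below generates synthetic 15-minute readings for November 2019; the CTE name sample_data, the generated values and the aliases rows_closed and rows_half_open are assumptions made for illustration, not part of the original data.

```sql
WITH sample_data AS (
  -- one reading every 15 minutes from 2019-11-01 00:00 to 2019-12-01 00:00 inclusive
  SELECT 1     AS id,
         LEVEL AS data_value,
         DATE '2019-11-01' + NUMTODSINTERVAL(15 * (LEVEL - 1), 'MINUTE') AS time_stamp
  FROM   DUAL
  CONNECT BY LEVEL <= 2881
)
SELECT *
FROM   (
         SELECT time_stamp,
                -- original window: both bounds inclusive
                COUNT(*) OVER (PARTITION BY id
                               ORDER BY time_stamp
                               RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
                                         AND CURRENT ROW) AS rows_closed,
                -- window shortened by one second: lower bound excluded
                COUNT(*) OVER (PARTITION BY id
                               ORDER BY time_stamp
                               RANGE BETWEEN (NUMTODSINTERVAL(EXTRACT( DAY FROM (TRUNC(time_stamp,'MM') - 1) ),'DAY')
                                              - NUMTODSINTERVAL(1,'SECOND')) PRECEDING
                                         AND CURRENT ROW) AS rows_half_open
         FROM   sample_data
       )
WHERE  time_stamp = DATE '2019-12-01';
```

If the windows behave as described, rows_closed should be 2881 (30 days × 96 readings plus the boundary row at 2019-11-01 00:00) and rows_half_open should be 2880.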