Simulating a query over a range of dates

Time: 2017-05-29 14:02:35

Tags: sql oracle plsql

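I have a fairly long query that looks at the past 13 weeks and determines whether the current day's performance is an anomaly compared with those 13 weeks. It returns a single row containing the date, the current day's performance, and a flag indicating whether it is an anomaly. To complicate matters, "performance" does not cover a calendar day but a rolling 24-hour window. The query runs every hour to monitor the KPIs over the last 24 hours. For example, if it is Tuesday 2 pm, it looks at the period from Monday 2 pm until now and compares it with the 2 pm to 2 pm windows of the past 13 weeks.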

To test whether this code works correctly, I want to simulate it having run over the past month.

The code looks like this:

WITH performance AS(
    -- shift each timestamp back by the current hour of day, so that the
    -- 24-hour window ending now falls on a single calendar date
    SELECT TRUNC(dateColumn - to_number(to_char(sysdate, 'hh24'))/24) as startdate,
           KPI_a,
           KPI_b,
           KPI_c
    FROM table
    WHERE someConditions
    GROUP BY TRUNC(dateColumn - to_number(to_char(sysdate, 'hh24'))/24)),
compare_t AS(
    -- looks at relationships of the KPIs
    ),
variables AS(
    -- calculates the variables required for the anomaly detection
    ),

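... Well, I'm not sure how much of the query I need to provide, but basically I need to simulate 'sysdate'. Instead of the current date, I want to feed in every hour of the past month, so that the query runs roughly 720 times (30 days × 24 hours) and returns one result for each hour of each day, 720 rows in total.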

I was thinking of a FOR loop, but I'm not sure.

2 answers:

Answer 0: (score: 0)

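You can use a recursive subquery to generate one row per hour and join it to the rest of the query in place of sysdate: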

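with times(time) as
(
  -- anchor: one month ago (ADD_MONTHS avoids ORA-01839 on month-end dates,
  -- which "sysdate - interval '1' month" can raise)
  select add_months(sysdate, -1) as time from dual
  union all
  -- recursive step: add one hour, without overshooting the current time
  select time + interval '1' hour from times
  where time + interval '1' hour <= sysdate
)
, performance as ()   -- placeholder: reference times.time here instead of sysdate
, compare_t as ()     -- placeholder
, variables as ()     -- placeholder
select * 
from times
join ...
order by time;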
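
To make the elided join more concrete, here is a minimal sketch of how the performance step might read the simulated run time from times instead of calling sysdate. It is not the original query: kpi_table and run_time are hypothetical names, AVG is only an assumed aggregate (the question does not say how the KPIs are aggregated), dateColumn is assumed to be a DATE, and the 13-week lookback bound is an approximation.

```sql
WITH times (run_time) AS (
  -- anchor: one month ago, rounded down to the full hour
  SELECT TRUNC(ADD_MONTHS(SYSDATE, -1), 'HH') FROM dual
  UNION ALL
  -- one additional row per hour, up to the current time
  SELECT run_time + INTERVAL '1' HOUR
  FROM times
  WHERE run_time + INTERVAL '1' HOUR <= SYSDATE
),
performance AS (
  SELECT t.run_time,
         -- same shift as in the question, but driven by the simulated run time
         TRUNC(k.dateColumn - TO_NUMBER(TO_CHAR(t.run_time, 'hh24')) / 24) AS startdate,
         AVG(k.KPI_a) AS KPI_a            -- assumed aggregate
  FROM times t
  JOIN kpi_table k                        -- hypothetical table name
    ON k.dateColumn >= t.run_time - 13 * 7   -- assumed lookback: 13 weeks, in days
   AND k.dateColumn <  t.run_time
  GROUP BY t.run_time,
           TRUNC(k.dateColumn - TO_NUMBER(TO_CHAR(t.run_time, 'hh24')) / 24)
)
SELECT *
FROM performance
ORDER BY run_time, startdate;
```

Every later step (compare_t, variables, the final flag) would then need run_time in its GROUP BY or PARTITION BY so that each simulated run stays separate and the result has one row per hour.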

Answer 1: (score: 0)

I don't fully understand your exact requirements, but I had to solve a similar problem. To give you an idea, here are two suggestions:

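Compute the average and standard deviation of the KPI values from 13 weeks ago up to yesterday. If the current value today is below "AVG - 10 * STDDEV", select the record, i.e. flag it as an anomaly.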

WITH t AS 
    (SELECT dateColumn, KPI_A,  
        AVG(KPI_A)    OVER (ORDER BY dateColumn RANGE BETWEEN 13 * INTERVAL '7' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING) AS REF_AVG,
        STDDEV(KPI_A) OVER (ORDER BY dateColumn RANGE BETWEEN 13 * INTERVAL '7' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING) AS REF_STDDEV 
    FROM TABLE
    WHERE someConditions)
SELECT dateColumn, REF_AVG, KPI_A, REF_STDDEV
FROM t 
WHERE TRUNC(dateColumn, 'HH') = TRUNC(LOCALTIMESTAMP, 'HH')
    AND KPI_A < REF_AVG - 10 * REF_STDDEV;
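
Because the window functions compute a reference average and standard deviation for every row, the same idea can be replayed over the past month without a loop. The following is only a sketch under assumptions: kpi_table is a hypothetical table name with roughly one row per hour, and the is_anomaly column is added here for illustration.

```sql
WITH t AS (
  SELECT dateColumn, KPI_A,
         AVG(KPI_A)    OVER (ORDER BY dateColumn
                             RANGE BETWEEN 13 * INTERVAL '7' DAY PRECEDING
                                       AND INTERVAL '1' DAY PRECEDING) AS REF_AVG,
         STDDEV(KPI_A) OVER (ORDER BY dateColumn
                             RANGE BETWEEN 13 * INTERVAL '7' DAY PRECEDING
                                       AND INTERVAL '1' DAY PRECEDING) AS REF_STDDEV
  FROM kpi_table                          -- hypothetical table name
)
SELECT dateColumn, KPI_A, REF_AVG, REF_STDDEV,
       CASE WHEN KPI_A < REF_AVG - 10 * REF_STDDEV
            THEN 'Y' ELSE 'N' END AS is_anomaly   -- illustrative flag column
FROM t
-- filter in the outer query only, so the windows still see the 13 weeks
-- of history that precede the first reported hour
WHERE dateColumn >= ADD_MONTHS(SYSDATE, -1)
ORDER BY dateColumn;
```

Putting the date filter inside the CTE instead would cut off the history and leave the earliest rows with an empty or shortened reference window.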

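Take the hourly values from one week earlier (i.e. the same weekday as yesterday) and correlate them with yesterday's hourly values. If the correlation is below a certain threshold (I use 95%), treat that day as an anomaly.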

WITH t AS
    (SELECT dateColumn, KPI_A,      
        FIRST_VALUE(KPI_A) OVER (ORDER BY dateColumn RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW) AS KPI_A_LAST_WEEK,
        dateColumn - FIRST_VALUE(dateColumn) OVER (ORDER BY dateColumn RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW) AS RANGE_INT
    FROM table
    WHERE ...)
SELECT 100*ROUND(CORR(KPI_A, KPI_A_LAST_WEEK), 2) AS CORR_VAL
FROM t
WHERE KPI_A_LAST_WEEK IS NOT NULL
    AND RANGE_INT = INTERVAL '7' DAY
    AND TRUNC(dateColumn) = TRUNC(LOCALTIMESTAMP - INTERVAL '1' DAY)
GROUP BY TRUNC(dateColumn);
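
The query above only reports the correlation for yesterday. As a sketch of how the 95% threshold might be applied to every day of the past month, the filter can move into a HAVING clause. This assumes kpi_table as a hypothetical table name and dateColumn as a TIMESTAMP (so that the difference of two values is an interval, as the comparison with INTERVAL '7' DAY requires).

```sql
WITH t AS (
  SELECT dateColumn, KPI_A,
         -- earliest value within the preceding 7 days
         FIRST_VALUE(KPI_A) OVER (ORDER BY dateColumn
             RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW) AS KPI_A_LAST_WEEK,
         -- distance to that earliest row; exactly 7 days when last week's row exists
         dateColumn - FIRST_VALUE(dateColumn) OVER (ORDER BY dateColumn
             RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW) AS RANGE_INT
  FROM kpi_table                          -- hypothetical table name
)
SELECT TRUNC(dateColumn) AS day_checked,
       100 * ROUND(CORR(KPI_A, KPI_A_LAST_WEEK), 2) AS CORR_VAL
FROM t
WHERE KPI_A_LAST_WEEK IS NOT NULL
  AND RANGE_INT = INTERVAL '7' DAY
  AND dateColumn >= TRUNC(ADD_MONTHS(SYSDATE, -1))
GROUP BY TRUNC(dateColumn)
-- keep only the days whose correlation with the week before is below 95%
HAVING CORR(KPI_A, KPI_A_LAST_WEEK) < 0.95
ORDER BY day_checked;
```

Days for which CORR returns NULL (for example, too few matching hours) are dropped by the HAVING clause and would need separate handling if they should also be reported.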