Splitting data into 30-minute interval values

Date: 2018-07-11 18:00:19

Tags: sql oracle plsql

I have a data set in the following format:

ID  START_TIME              END_TIME                VAL

1   30-APR-2018 00:00:00    01-MAY-2018 00:00:00    423
2   01-MAY-2018 00:00:00    01-MAY-2018 17:15:00    455
3   01-MAY-2018 17:15:00    03-MAY-2018 00:00:00    455
  

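Expected output:

The data set should be broken down into 30-minute intervals. However, if a record does not start or end on the '00' or '30' minute mark, that time should be kept as is as part of this process (as with the rows where START_TIME / END_TIME = '17:15:00').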

ID  START_TIME              END_TIME                VAL

1   30-APR-2018 00:00:00    30-APR-2018 00:30:00    423
1   30-APR-2018 00:30:00    30-APR-2018 01:00:00    423
1   30-APR-2018 01:00:00    30-APR-2018 01:30:00    423
..
..
..

1   30-APR-2018 23:00:00    30-APR-2018 23:30:00    423
1   30-APR-2018 23:30:00    01-MAY-2018 00:00:00    423
2   01-MAY-2018 00:00:00    01-MAY-2018 00:30:00    455
2   01-MAY-2018 00:30:00    01-MAY-2018 01:00:00    455
..
..
..
..
2   01-MAY-2018 16:30:00    01-MAY-2018 17:00:00    455
2   01-MAY-2018 17:00:00    01-MAY-2018 17:15:00    455
3   01-MAY-2018 17:15:00    01-MAY-2018 17:30:00    455
3   01-MAY-2018 17:30:00    01-MAY-2018 18:00:00    455
..
..
..
3   02-MAY-2018 23:00:00    02-MAY-2018 23:30:00    455
3   02-MAY-2018 23:30:00    03-MAY-2018 00:00:00    455

What I have tried so far:

CREATE TABLE TESTT
( 
    ID NUMBER(8,3),
    START_TIME DATE,
    END_TIME DATE,
    VAL NUMBER(8,3)
);

INSERT INTO TESTT VALUES (1, TO_DATE('30-APR-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'),  TO_DATE('01-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), 423);
INSERT INTO TESTT VALUES (2, TO_DATE('01-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'),  TO_DATE('01-MAY-2018 17:15:00','DD-MON-YYYY HH24:MI:SS'), 455);
INSERT INTO TESTT VALUES (3, TO_DATE('01-MAY-2018 17:15:00','DD-MON-YYYY HH24:MI:SS'),  TO_DATE('03-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), 455);
COMMIT;

CREATE TABLE TESTT_OUTPUT AS
SELECT * FROM TESTT WHERE 1=2;

CREATE SEQUENCE TESTT_SEQ MINVALUE 1 MAXVALUE 9999999999999999999999999999 INCREMENT BY 1 START WITH 1 NOCACHE NOORDER NOCYCLE NOPARTITION;


BEGIN

FOR R IN (SELECT * FROM TESTT)
LOOP
    INSERT INTO TESTT_OUTPUT(id, START_TIME, END_TIME, VAL)
    SELECT TESTT_SEQ.nextval, R.START_TIME + (LEVEL - 1)/48 AS START_TIME, R.START_TIME + LEVEL/48 AS END_TIME, R.VAL FROM
    DUAL
    CONNECT BY LEVEL <= ROUND((R.END_TIME - R.START_TIME)*48);

    COMMIT;
END LOOP;

END;
/


SELECT * FROM TESTT_OUTPUT;




1   30-APR-2018 00:00:00    30-APR-2018 00:30:00    423
2   30-APR-2018 00:30:00    30-APR-2018 01:00:00    423
3   30-APR-2018 01:00:00    30-APR-2018 01:30:00    423
..
..
..
47  30-APR-2018 23:00:00    30-APR-2018 23:30:00    423
48  30-APR-2018 23:30:00    01-MAY-2018 00:00:00    423
49  01-MAY-2018 00:00:00    01-MAY-2018 00:30:00    455
50  01-MAY-2018 00:30:00    01-MAY-2018 01:00:00    455
..
..
..
82  01-MAY-2018 16:30:00    01-MAY-2018 17:00:00    455
83  01-MAY-2018 17:00:00    01-MAY-2018 17:30:00    455
84  01-MAY-2018 17:15:00    01-MAY-2018 17:45:00    455
85  01-MAY-2018 17:45:00    01-MAY-2018 18:15:00    455
86  01-MAY-2018 18:15:00    01-MAY-2018 18:45:00    455
87  01-MAY-2018 18:45:00    01-MAY-2018 19:15:00    455
..
..
..
141 02-MAY-2018 21:45:00    02-MAY-2018 22:15:00    455
142 02-MAY-2018 22:15:00    02-MAY-2018 22:45:00    455
143 02-MAY-2018 22:45:00    02-MAY-2018 23:15:00    455
144 02-MAY-2018 23:15:00    02-MAY-2018 23:45:00    455
145 02-MAY-2018 23:45:00    03-MAY-2018 00:15:00    455

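With this approach, any record whose minute value is something other than '00' or '30' is still processed the same way, by repeatedly adding 30 minutes, so the final result has no rows on the '00' or '30' minute marks for that record.

Hope this makes sense.

Any input on how to convert the data into the expected format would be very helpful. Thanks!

1 Answer:

Answer 0: (score: 2)

This isn't very elegant, but the following query: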

select id,
  greatest(start_time,
    adj_start_time + numtodsinterval(30 * (level - 1), 'MINUTE')) as start_time,
  least(end_time,
    adj_start_time + numtodsinterval(30 * level, 'MINUTE')) as end_time
from (
  select id,
    start_time,
    end_time,
    trunc(start_time, 'HH')
      + numtodsinterval(
          case when extract(minute from cast(start_time as timestamp)) < 30 then 0
               else 30
          end, 'MINUTE') as adj_start_time
  from testt
)
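-- count slices from the aligned start so a trailing partial slice is kept
-- when both start_time and end_time are off the half hour
connect by level <= ceil((end_time - adj_start_time - 1/86400) / (30/1440))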
and prior id = id
and prior dbms_random.value is not null
order by id, start_time;

appears to produce the desired result, generating 145 rows:

        ID START_TIME          END_TIME           
---------- ------------------- -------------------
         1 2018-04-30 00:00:00 2018-04-30 00:30:00
         1 2018-04-30 00:30:00 2018-04-30 01:00:00
         1 2018-04-30 01:00:00 2018-04-30 01:30:00
...
         1 2018-04-30 22:30:00 2018-04-30 23:00:00
         1 2018-04-30 23:00:00 2018-04-30 23:30:00
         1 2018-04-30 23:30:00 2018-05-01 00:00:00
         2 2018-05-01 00:00:00 2018-05-01 00:30:00
         2 2018-05-01 00:30:00 2018-05-01 01:00:00
         2 2018-05-01 01:00:00 2018-05-01 01:30:00
...
         2 2018-05-01 16:00:00 2018-05-01 16:30:00
         2 2018-05-01 16:30:00 2018-05-01 17:00:00
         2 2018-05-01 17:00:00 2018-05-01 17:15:00
         3 2018-05-01 17:15:00 2018-05-01 17:30:00
         3 2018-05-01 17:30:00 2018-05-01 18:00:00
         3 2018-05-01 18:00:00 2018-05-01 18:30:00
...
         3 2018-05-02 22:30:00 2018-05-02 23:00:00
         3 2018-05-02 23:00:00 2018-05-02 23:30:00
         3 2018-05-02 23:30:00 2018-05-03 00:00:00

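The inline view gets the real columns plus the start of the nominal 30-minute window that start_time falls in, i.e. for 17:15 it gets 17:00, as adj_start_time. The hierarchical query adds 30-minute intervals to that, and uses greatest and least to fall back to the original start/end times when they are not on the half hour. The prior id = id condition keeps each source row in its own hierarchy, and prior dbms_random.value is not null stops Oracle from treating that self-reference as a cycle.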
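As a minimal sketch of that alignment and clamping step, the query below computes adj_start_time and only the first slice for two rows. The samples and aligned CTE names and their two rows are made up for illustration; the expressions are the same ones used in the query above.

```sql
-- Illustration only: "samples" and its two rows are hypothetical.
with samples (start_time, end_time) as (
  select to_date('2018-05-01 17:15:00', 'YYYY-MM-DD HH24:MI:SS'),
         to_date('2018-05-03 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
  from dual
  union all
  select to_date('2018-05-01 17:00:00', 'YYYY-MM-DD HH24:MI:SS'),
         to_date('2018-05-01 17:15:00', 'YYYY-MM-DD HH24:MI:SS')
  from dual
),
aligned as (
  -- round start_time down to the half hour it falls in
  select start_time, end_time,
    trunc(start_time, 'HH')
      + numtodsinterval(
          case when extract(minute from cast(start_time as timestamp)) < 30 then 0
               else 30
          end, 'MINUTE') as adj_start_time
  from samples
)
select adj_start_time,
  -- never start before the real start_time
  greatest(start_time, adj_start_time) as first_slice_start,
  -- never end after the real end_time
  least(end_time, adj_start_time + numtodsinterval(30, 'MINUTE')) as first_slice_end
from aligned;
```

Both rows get an adj_start_time of 17:00. The first row's slice is clamped to 17:15 through 17:30 by greatest, and the second row's slice is clamped to 17:00 through 17:15 by least.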

For the insert, you can replace the original ID with an analytic row_number() instead of using a sequence, and add val:

insert into testt_output(id, start_time, end_time, val)
select row_number() over (order by id, level),
  greatest(start_time,
    adj_start_time + numtodsinterval(30 * (level - 1), 'MINUTE')) as start_time,
  least(end_time,
    adj_start_time + numtodsinterval(30 * level, 'MINUTE')) as end_time,
  val
from (
  select id,
    start_time,
    end_time,
    val,
    trunc(start_time, 'HH')
      + numtodsinterval(
          case when extract(minute from cast(start_time as timestamp)) < 30 then 0
               else 30
          end, 'MINUTE') as adj_start_time
  from testt
)
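-- count slices from the aligned start so a trailing partial slice is kept
-- when both start_time and end_time are off the half hour
connect by level <= ceil((end_time - adj_start_time - 1/86400) / (30/1440))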
and prior id = id
and prior dbms_random.value is not null;

145 rows inserted.

select * from testt_output;

        ID START_TIME          END_TIME                   VAL
---------- ------------------- ------------------- ----------
         1 2018-04-30 00:00:00 2018-04-30 00:30:00        423
         2 2018-04-30 00:30:00 2018-04-30 01:00:00        423
...
        47 2018-04-30 23:00:00 2018-04-30 23:30:00        423
        48 2018-04-30 23:30:00 2018-05-01 00:00:00        423
        49 2018-05-01 00:00:00 2018-05-01 00:30:00        455
        50 2018-05-01 00:30:00 2018-05-01 01:00:00        455
...
        82 2018-05-01 16:30:00 2018-05-01 17:00:00        455
        83 2018-05-01 17:00:00 2018-05-01 17:15:00        455
        84 2018-05-01 17:15:00 2018-05-01 17:30:00        455
        85 2018-05-01 17:30:00 2018-05-01 18:00:00        455
...
       144 2018-05-02 23:00:00 2018-05-02 23:30:00        455
       145 2018-05-02 23:30:00 2018-05-03 00:00:00        455
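One way to sanity-check the result is sketched below. It assumes testt_output contains only the rows from the insert above (not rows left over from the earlier PL/SQL attempt). The column aliases are made up for this example, and the one-second tolerance is an assumption to absorb rounding in date arithmetic.

```sql
-- Sketch of a consistency check; aliases and tolerance are illustrative.
select
  (select count(*) from testt_output) as slice_count,
  -- slices that are empty, reversed, or longer than 30 minutes
  (select count(*)
   from testt_output
   where end_time <= start_time
      or end_time - start_time > 30/1440 + 1/86400) as bad_slices,
  -- total covered time, in days, before and after slicing
  (select sum(end_time - start_time) from testt) as source_days,
  (select sum(end_time - start_time) from testt_output) as sliced_days
from dual;
```

For the sample data, slice_count should match the 145 rows inserted, bad_slices should be 0, and the two totals should agree apart from rounding.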

db<>fiddle demo