I have a table of user status changes, for example:
insert_time      status
1/1/2017 0:00    AVAILABLE
1/1/2017 0:15    BUSY
1/1/2017 0:30    NOT AVAILABLE
1/1/2017 1:30    AVAILABLE
1/1/2017 3:10    BUSY
1/1/2017 5:00    NOT AVAILABLE
For example, this user was available from 00:00 to 00:15, busy from 00:15 to 00:30, and so on.
For analysis, I need to transform it into data with this structure:
day        hour  available_minutes  not_available_minutes  busy_minutes
1/1/2017   0     15                 30                     15
1/1/2017   1     30                 30                     0
1/1/2017   2     60                 0                      0
1/1/2017   3     10                 0                      50
1/1/2017   4     0                  0                      60
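The output must include the hours in which the status did not change.
I don't think this is a simple PIVOT query, because a single row has to be split across several columns, and hours with no data have to be included.
How can I do this in an Oracle SQL query?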
Answer 0 (score: 1)
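One solution to this kind of query has two parts: generating the categories, then aggregating into the generated categories.
For the data you provided, the first step is to lay the data out by hour. Your data has no events in the 02:00 or 04:00 hours, so those hours have to be generated in order to appear in the final result.
The second part is to aggregate into the hourly buckets with PIVOT, as Jorge Campos mentioned in the comments.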
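To make the two-part idea concrete, here is a minimal sketch that is separate from the full solution. It generates six hourly buckets with CONNECT BY and counts events per bucket, so that empty hours still appear. The HOURS and EVENTS names and the two sample timestamps are assumptions made up for this illustration only.

```sql
-- Part 1: generate the categories (hours 0 to 5), independent of the data.
WITH HOURS AS (
    SELECT LEVEL - 1 AS THE_HOUR
    FROM DUAL
    CONNECT BY LEVEL <= 6
),
-- Hypothetical inline sample data: one event in hour 0, one in hour 3.
EVENTS AS (
    SELECT TIMESTAMP '2017-01-01 00:15:00' AS EVENT_TIME FROM DUAL
    UNION ALL
    SELECT TIMESTAMP '2017-01-01 03:10:00' FROM DUAL
)
-- Part 2: aggregate into the generated categories.
-- The LEFT JOIN keeps hours without events; COUNT ignores the NULLs.
SELECT HOURS.THE_HOUR,
       COUNT(EVENTS.EVENT_TIME) AS EVENT_COUNT
FROM HOURS
LEFT JOIN EVENTS
    ON EXTRACT(HOUR FROM EVENTS.EVENT_TIME) = HOURS.THE_HOUR
GROUP BY HOURS.THE_HOUR
ORDER BY HOURS.THE_HOUR;
```

Hours 1, 2, 4 and 5 come back with a count of 0 rather than being absent, which is the behavior the hourly report needs.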
Here is an example.
First, create a test table:
CREATE TABLE INSERT_TIME_STATUS(
INSERT_TIME TIMESTAMP,
STATUS VARCHAR2(128)
);
And add the test data:
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 00:00:00', 'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 00:15:00', 'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 00:30:00', 'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 01:30:00', 'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 03:10:00', 'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (TIMESTAMP '2017-01-01 05:00:00', 'NOT AVAILABLE');
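Then build the query. It uses subquery factoring (WITH clauses) to lay out the steps of the process.
HOUR_OF_DAY generates the 24 hours of a day, and CALENDAR generates every day between the first and the last record, so every hour is represented whether or not any record falls in it.
HOUR_CALENDAR combines the two into one row per hour and looks up the status in effect at the start of that hour.
ALL_HOUR_STATUS adds the status changes that happen inside each hour, so a status that spans an hour boundary is cut into pieces that each fit within a single hour.
DURATION_IN_STATUS computes how many minutes each status was active within each hour.
The final query uses PIVOT to total (SUM) the time each STATUS was active per hour.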
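The duration step relies on the fact that subtracting one TIMESTAMP from another in Oracle yields an INTERVAL DAY TO SECOND, from which EXTRACT can pull the hour and minute parts. The following standalone sketch shows that arithmetic; the two literal timestamps and the GAP and MINUTES_BETWEEN names are chosen only for illustration.

```sql
-- TIMESTAMP - TIMESTAMP gives an INTERVAL DAY TO SECOND (here 1 hour 40 minutes).
-- Converting it to minutes: hours * 60 + minutes.
SELECT EXTRACT(HOUR FROM GAP) * 60
       + EXTRACT(MINUTE FROM GAP) AS MINUTES_BETWEEN
FROM (
    SELECT TIMESTAMP '2017-01-01 03:10:00'
           - TIMESTAMP '2017-01-01 01:30:00' AS GAP
    FROM DUAL
);
-- Returns 100
```

Note that this form ignores the day and second parts of the interval. That is acceptable in the query below because every piece has already been cut to fit inside one hour and the sample data has no seconds.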
WITH HOUR_OF_DAY AS (SELECT LEVEL - 1 AS THE_HOUR
FROM DUAL
CONNECT BY LEVEL < 25),
CALENDAR AS (SELECT DAY_START
FROM (
SELECT (TIMESTAMP '2017-01-01 00:00:00' + NUMTODSINTERVAL(DATE_INCREMENT.OFFSET, 'DAY')) AS DAY_START
FROM (SELECT LEVEL - 1 AS OFFSET
FROM DUAL
CONNECT BY LEVEL < 9999) DATE_INCREMENT)
WHERE DAY_START BETWEEN (SELECT MIN(TRUNC(INSERT_TIME_STATUS.INSERT_TIME))
FROM INSERT_TIME_STATUS)
AND (SELECT MAX(TRUNC(INSERT_TIME_STATUS.INSERT_TIME))
FROM INSERT_TIME_STATUS)),
HOUR_CALENDAR AS (
SELECT
TO_CHAR(CALENDAR.DAY_START, 'MM/DD/YYYY') AS THE_DAY,
HOUR_OF_DAY.THE_HOUR,
CALENDAR.DAY_START + NUMTODSINTERVAL(HOUR_OF_DAY.THE_HOUR, 'HOUR') AS HOUR_START,
(SELECT MAX(INSERT_TIME_STATUS.STATUS)
KEEP (DENSE_RANK LAST
ORDER BY INSERT_TIME_STATUS.INSERT_TIME ASC)
FROM INSERT_TIME_STATUS
WHERE INSERT_TIME_STATUS.INSERT_TIME <= DAY_START + NUMTODSINTERVAL(THE_HOUR, 'HOUR')) AS HOUR_START_STATUS
FROM CALENDAR
CROSS JOIN HOUR_OF_DAY),
ALL_HOUR_STATUS AS (
SELECT
HOUR_CALENDAR.THE_DAY,
HOUR_CALENDAR.THE_HOUR,
HOUR_CALENDAR.HOUR_START AS THE_TIME,
HOUR_CALENDAR.HOUR_START_STATUS AS THE_STATUS
FROM HOUR_CALENDAR
UNION ALL
SELECT
HOUR_CALENDAR.THE_DAY,
HOUR_CALENDAR.THE_HOUR,
INSERT_TIME_STATUS.INSERT_TIME AS THE_TIME,
INSERT_TIME_STATUS.STATUS AS THE_STATUS
FROM HOUR_CALENDAR
INNER JOIN INSERT_TIME_STATUS
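ON HOUR_CALENDAR.HOUR_START < INSERT_TIME_STATUS.INSERT_TIME
-- Match the calendar day too, so a record only joins to the hour of its own day
AND HOUR_CALENDAR.THE_DAY = TO_CHAR(INSERT_TIME_STATUS.INSERT_TIME, 'MM/DD/YYYY')
AND HOUR_CALENDAR.THE_HOUR = EXTRACT(HOUR FROM INSERT_TIME_STATUS.INSERT_TIME)),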
DURATION_IN_STATUS AS (
SELECT
ALL_HOUR_STATUS.THE_DAY,
ALL_HOUR_STATUS.THE_HOUR,
ALL_HOUR_STATUS.THE_STATUS,
(EXTRACT(HOUR FROM
(COALESCE(LEAD(THE_TIME)
OVER (
PARTITION BY NULL
ORDER BY THE_TIME ASC ), TO_TIMESTAMP(THE_DAY, 'MM/DD/YYYY') + NUMTODSINTERVAL(THE_HOUR + 1, 'HOUR')) - THE_TIME)) * 60)
+
EXTRACT(MINUTE FROM
(COALESCE(LEAD(THE_TIME)
OVER (
PARTITION BY NULL
ORDER BY THE_TIME ASC ), TO_TIMESTAMP(THE_DAY, 'MM/DD/YYYY') + NUMTODSINTERVAL(THE_HOUR + 1, 'HOUR')) - THE_TIME))
AS DURATION_IN_STATUS
FROM ALL_HOUR_STATUS)
SELECT
THE_DAY,
THE_HOUR,
COALESCE(AVAILABLE, 0) AS AVAILABLE,
COALESCE(NOT_AVAILABLE, 0) AS NOT_AVAILABLE,
COALESCE(BUSY, 0) AS BUSY
FROM DURATION_IN_STATUS
PIVOT (SUM(DURATION_IN_STATUS)
FOR THE_STATUS
IN ('AVAILABLE' AS AVAILABLE, 'NOT AVAILABLE' AS NOT_AVAILABLE, 'BUSY' AS BUSY)
)
ORDER BY THE_DAY ASC, THE_HOUR ASC;
Result:
THE_DAY THE_HOUR AVAILABLE NOT_AVAILABLE BUSY
01/01/2017 0 15 30 15
01/01/2017 1 30 30 0
01/01/2017 2 60 0 0
01/01/2017 3 10 0 50
01/01/2017 4 0 0 60
01/01/2017 5 0 60 0
01/01/2017 6 0 60 0
01/01/2017 7 0 60 0
01/01/2017 8 0 60 0
01/01/2017 9 0 60 0
01/01/2017 10 0 60 0
01/01/2017 11 0 60 0
01/01/2017 12 0 60 0
01/01/2017 13 0 60 0
01/01/2017 14 0 60 0
01/01/2017 15 0 60 0
01/01/2017 16 0 60 0
01/01/2017 17 0 60 0
01/01/2017 18 0 60 0
01/01/2017 19 0 60 0
01/01/2017 20 0 60 0
01/01/2017 21 0 60 0
01/01/2017 22 0 60 0
01/01/2017 23 0 60 0
24 rows selected.
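This example query generates records for the whole day, so the final NOT AVAILABLE status carries through to the end of the day. If you would rather stop at the last status change, you can adjust this behavior as needed.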
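One possible way to stop at the last status change is to keep only the hours that start before the latest INSERT_TIME. The sketch below shows that predicate on its own against the test table above; it is an assumption about what "stopping" should mean, not part of the original query, and the base date is hard-coded to match the test data.

```sql
-- List only the hours of 2017-01-01 that begin before the last status change.
WITH HOUR_OF_DAY AS (
    SELECT LEVEL - 1 AS THE_HOUR
    FROM DUAL
    CONNECT BY LEVEL < 25
)
SELECT HOUR_OF_DAY.THE_HOUR
FROM HOUR_OF_DAY
WHERE TIMESTAMP '2017-01-01 00:00:00'
      + NUMTODSINTERVAL(HOUR_OF_DAY.THE_HOUR, 'HOUR')
      < (SELECT MAX(INSERT_TIME) FROM INSERT_TIME_STATUS)
ORDER BY HOUR_OF_DAY.THE_HOUR;
```

With the test data the last change is at 05:00, so this returns hours 0 through 4. Adding an equivalent WHERE clause to HOUR_CALENDAR in the full query should limit the report to the same hours.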
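Edit: in response to your update about evaluating these times per channel_id and user_id, here is another example.
First, create the test table (it reuses the name INSERT_TIME_STATUS, so drop the earlier table first if it still exists):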
CREATE TABLE INSERT_TIME_STATUS(
USER_ID NUMBER,
CHANNEL_ID NUMBER,
INSERT_TIME TIMESTAMP,
STATUS VARCHAR2(128)
);
And load it (here user 1111 is on channels 3 and 4, and user 2222 is only on channel 3):
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 0:00','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 0:15','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 0:30','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 1:30','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 3:10','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,3,TO_TIMESTAMP('1/1/2017 5:00','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 0:00','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 0:15','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 0:30','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 1:30','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 3:10','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (1111,4,TO_TIMESTAMP('1/1/2017 5:00','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 0:00','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 0:15','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 0:30','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 1:30','MM/DD/YYYY HH24:MI'),'AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 3:10','MM/DD/YYYY HH24:MI'),'BUSY');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 5:00','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
INSERT INTO INSERT_TIME_STATUS VALUES (2222,3,TO_TIMESTAMP('1/1/2017 5:00','MM/DD/YYYY HH24:MI'),'NOT AVAILABLE');
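Then update the query to produce the data per user_id and per channel_id. In this example, data is always included for every channel a user appears on. User 1111 gets a count for every hour on channels 3 and 4, while user 2222 gets a count for every hour on channel 3 only (if that user had records on another channel, that channel would be included as well).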
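The per-user, per-channel version depends on looking up the status in effect at a given moment separately for each (USER_ID, CHANNEL_ID) pair. Here is a small sketch of that lookup in isolation, run against the test table above. The S and PAIRS aliases, the STATUS_AT_0200 column name and the fixed 02:00 timestamp are assumptions for illustration.

```sql
-- For each user/channel pair, find the most recent status at or before 02:00.
-- MAX(...) KEEP (DENSE_RANK LAST ORDER BY ...) returns the STATUS of the row
-- with the latest INSERT_TIME among the rows that pass the WHERE clause.
SELECT PAIRS.USER_ID,
       PAIRS.CHANNEL_ID,
       (SELECT MAX(S.STATUS)
               KEEP (DENSE_RANK LAST ORDER BY S.INSERT_TIME ASC)
        FROM INSERT_TIME_STATUS S
        WHERE S.INSERT_TIME <= TIMESTAMP '2017-01-01 02:00:00'
          -- Outer columns are qualified with PAIRS so the subquery is correlated.
          AND S.USER_ID = PAIRS.USER_ID
          AND S.CHANNEL_ID = PAIRS.CHANNEL_ID) AS STATUS_AT_0200
FROM (SELECT DISTINCT USER_ID, CHANNEL_ID FROM INSERT_TIME_STATUS) PAIRS
ORDER BY PAIRS.USER_ID, PAIRS.CHANNEL_ID;
```

With the test data, all three pairs return AVAILABLE, because the last change before 02:00 is the 01:30 switch to AVAILABLE. Qualifying the outer columns matters: an unqualified USER_ID inside the subquery would resolve to the inner table and compare the column with itself.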
WITH HOUR_OF_DAY AS (SELECT LEVEL - 1 AS THE_HOUR
FROM DUAL
CONNECT BY LEVEL < 25),
CALENDAR AS (SELECT DAY_START
FROM (
SELECT ((SELECT MIN(TRUNC(INSERT_TIME_STATUS.INSERT_TIME))
FROM INSERT_TIME_STATUS) + NUMTODSINTERVAL(DATE_INCREMENT.OFFSET, 'DAY')) AS DAY_START
FROM (SELECT LEVEL - 1 AS OFFSET
FROM DUAL
CONNECT BY LEVEL < 9999) DATE_INCREMENT)
WHERE DAY_START BETWEEN (SELECT MIN(TRUNC(INSERT_TIME_STATUS.INSERT_TIME))
FROM INSERT_TIME_STATUS)
AND (SELECT MAX(TRUNC(INSERT_TIME_STATUS.INSERT_TIME))
FROM INSERT_TIME_STATUS)),
USER_CHANNEL_HOUR_CALENDAR AS (
SELECT
USER_ID,
CHANNEL_ID,
CALENDAR.DAY_START,
TO_CHAR(CALENDAR.DAY_START, 'MM/DD/YYYY') AS THE_DAY,
HOUR_OF_DAY.THE_HOUR,
CALENDAR.DAY_START + NUMTODSINTERVAL(HOUR_OF_DAY.THE_HOUR, 'HOUR') AS HOUR_START
FROM CALENDAR
CROSS JOIN HOUR_OF_DAY
--
CROSS JOIN (SELECT UNIQUE USER_ID, CHANNEL_ID FROM INSERT_TIME_STATUS)
),
HOUR_CALENDAR AS (
SELECT USER_ID,
CHANNEL_ID,
THE_DAY,
THE_HOUR,
DAY_START,
HOUR_START,
(SELECT MAX(INSERT_TIME_STATUS.STATUS)
KEEP (DENSE_RANK LAST
ORDER BY INSERT_TIME_STATUS.INSERT_TIME ASC)
FROM INSERT_TIME_STATUS
WHERE INSERT_TIME_STATUS.INSERT_TIME <= DAY_START + NUMTODSINTERVAL(THE_HOUR, 'HOUR')
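-- Qualify the outer columns; unqualified names would resolve to INSERT_TIME_STATUS itself
AND INSERT_TIME_STATUS.USER_ID = USER_CHANNEL_HOUR_CALENDAR.USER_ID
AND INSERT_TIME_STATUS.CHANNEL_ID = USER_CHANNEL_HOUR_CALENDAR.CHANNEL_ID) AS HOUR_START_STATUS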
FROM USER_CHANNEL_HOUR_CALENDAR),
ALL_HOUR_STATUS AS (
SELECT
HOUR_CALENDAR.USER_ID,
HOUR_CALENDAR.CHANNEL_ID,
HOUR_CALENDAR.THE_DAY,
HOUR_CALENDAR.THE_HOUR,
HOUR_CALENDAR.HOUR_START AS THE_TIME,
HOUR_CALENDAR.HOUR_START_STATUS AS THE_STATUS
FROM HOUR_CALENDAR
UNION ALL
SELECT
INSERT_TIME_STATUS.USER_ID,
INSERT_TIME_STATUS.CHANNEL_ID,
HOUR_CALENDAR.THE_DAY,
HOUR_CALENDAR.THE_HOUR,
INSERT_TIME_STATUS.INSERT_TIME AS THE_TIME,
INSERT_TIME_STATUS.STATUS AS THE_STATUS
FROM HOUR_CALENDAR
INNER JOIN INSERT_TIME_STATUS
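ON HOUR_CALENDAR.HOUR_START < INSERT_TIME_STATUS.INSERT_TIME
-- Match the calendar day too, so a record only joins to the hour of its own day
AND HOUR_CALENDAR.THE_DAY = TO_CHAR(INSERT_TIME_STATUS.INSERT_TIME, 'MM/DD/YYYY')
AND HOUR_CALENDAR.THE_HOUR = EXTRACT(HOUR FROM INSERT_TIME_STATUS.INSERT_TIME)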
AND HOUR_CALENDAR.USER_ID = INSERT_TIME_STATUS.USER_ID
AND HOUR_CALENDAR.CHANNEL_ID = INSERT_TIME_STATUS.CHANNEL_ID),
DURATION_IN_STATUS AS (
SELECT
ALL_HOUR_STATUS.USER_ID,
ALL_HOUR_STATUS.CHANNEL_ID,
ALL_HOUR_STATUS.THE_DAY,
ALL_HOUR_STATUS.THE_HOUR,
ALL_HOUR_STATUS.THE_STATUS,
(EXTRACT(HOUR FROM
(COALESCE(LEAD(THE_TIME)
OVER (
PARTITION BY USER_ID, CHANNEL_ID
ORDER BY THE_TIME ASC ), TO_TIMESTAMP(THE_DAY, 'MM/DD/YYYY') + NUMTODSINTERVAL(THE_HOUR + 1, 'HOUR')) - THE_TIME)) * 60)
+
EXTRACT(MINUTE FROM
(COALESCE(LEAD(THE_TIME)
OVER (
PARTITION BY USER_ID, CHANNEL_ID
ORDER BY THE_TIME ASC ), TO_TIMESTAMP(THE_DAY, 'MM/DD/YYYY') + NUMTODSINTERVAL(THE_HOUR + 1, 'HOUR')) - THE_TIME))
AS DURATION_IN_STATUS
FROM ALL_HOUR_STATUS)
SELECT
USER_ID,
CHANNEL_ID,
THE_DAY,
THE_HOUR,
COALESCE(AVAILABLE, 0) AS AVAILABLE,
COALESCE(NOT_AVAILABLE, 0) AS NOT_AVAILABLE,
COALESCE(BUSY, 0) AS BUSY
FROM DURATION_IN_STATUS
PIVOT (SUM(DURATION_IN_STATUS)
FOR THE_STATUS
IN ('AVAILABLE' AS AVAILABLE, 'NOT AVAILABLE' AS NOT_AVAILABLE, 'BUSY' AS BUSY)
)
-- You can additionally filter the result
-- WHERE CHANNEL_ID IN (3,4)
-- WHERE USER_ID = 12345
-- WHERE TO_DATE(THE_DAY, 'MM/DD/YYYY') > DATE '2017-01-01'
-- etc.
ORDER BY USER_ID ASC, CHANNEL_ID ASC, THE_DAY ASC, THE_HOUR ASC;
Then test it:
USER_ID CHANNEL_ID THE_DAY THE_HOUR AVAILABLE NOT_AVAILABLE BUSY
1111 3 01/01/2017 0 15 30 15
1111 3 01/01/2017 1 30 30 0
1111 3 01/01/2017 2 60 0 0
1111 3 01/01/2017 3 10 0 50
1111 3 01/01/2017 4 0 0 60
1111 3 01/01/2017 5 0 60 0
1111 3 01/01/2017 6 0 60 0
...
1111 3 01/01/2017 23 0 60 0
1111 4 01/01/2017 0 15 30 15
1111 4 01/01/2017 1 30 30 0
1111 4 01/01/2017 2 60 0 0
1111 4 01/01/2017 3 10 0 50
1111 4 01/01/2017 4 0 0 60
1111 4 01/01/2017 5 0 60 0
1111 4 01/01/2017 6 0 60 0
...
1111 4 01/01/2017 23 0 60 0
2222 3 01/01/2017 0 15 30 15
2222 3 01/01/2017 1 30 30 0
2222 3 01/01/2017 2 60 0 0
2222 3 01/01/2017 3 10 0 50
2222 3 01/01/2017 4 0 0 60
2222 3 01/01/2017 5 0 60 0
2222 3 01/01/2017 6 0 60 0