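SQL record pairing and time interval calculation

Time: 2018-12-25 06:42:36

Tags: sql amazon-redshift

I want to calculate the time interval between specific records. Here are my SQL query and the rows it returns.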

select event_timestamp, item_id
from my_table
where event_type = 'item_clicked'
  and (item_id = 'btnA' or item_id = 'btnB')
  and user_id = '5afcd689c926dc6b1573d7cbff23aa7e'
order by event_timestamp DESC

event_timestamp item_id
2018-08-08 12:39:56 btnA
2018-08-08 12:37:26 btnB
2018-08-08 12:37:09 btnA
2018-08-08 12:36:41 btnB
2018-08-08 12:34:06 btnA
2018-08-08 12:33:56 btnB
2018-08-08 12:30:32 btnB
2018-08-08 12:29:55 btnB
2018-07-13 01:48:17 btnB
2018-07-12 03:31:07 btnA
2018-07-12 01:52:50 btnB
2018-07-11 17:01:56 btnA
2018-07-11 16:32:16 btnA
2018-07-09 06:56:49 btnB

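However, what I actually want is the time interval between changes from one item_id to another.

For example, I want to know when the user clicked btnA and how long it took until he/she then clicked btnB. How can I generate a table like this with a SQL query?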

from_item_id    to_item_id    total_seconds    average_seconds
btnA            btnB          112256           28064

[Note]
2018-07-11 17:01:56 -> 2018-07-12 01:52:50: 31854 seconds
2018-07-12 03:31:07 -> 2018-07-13 01:48:17: 80230 seconds
2018-08-08 12:34:06 -> 2018-08-08 12:36:41: 155 seconds
2018-08-08 12:37:09 -> 2018-08-08 12:37:26: 17 seconds
The total is 112256 seconds, and the average is 28064 seconds (112256 / 4).
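
As a quick check of the arithmetic in the note, here is a minimal Python sketch. It is not part of the original question; the helper name `seconds_between` is made up for illustration, and the four timestamp pairs are copied from the note.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# The four btnA -> btnB pairs listed in the note (timestamps from the question).
pairs = [
    ("2018-07-11 17:01:56", "2018-07-12 01:52:50"),
    ("2018-07-12 03:31:07", "2018-07-13 01:48:17"),
    ("2018-08-08 12:34:06", "2018-08-08 12:36:41"),
    ("2018-08-08 12:37:09", "2018-08-08 12:37:26"),
]


def seconds_between(start: str, end: str) -> int:
    """Whole seconds elapsed from start to end (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


intervals = [seconds_between(a, b) for a, b in pairs]
total_seconds = sum(intervals)
average_seconds = total_seconds // len(intervals)

print(intervals)        # [31854, 80230, 155, 17]
print(total_seconds)    # 112256
print(average_seconds)  # 28064
```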

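2 answers:

Answer 0 (score: 1)

You can try using the LEAD window function together with CASE WHEN in a subquery.

Then use the aggregate functions SUM and COUNT to get the result.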

CREATE TABLE my_table(
    event_timestamp TIMESTAMP,
    item_id VARCHAR(50)
);



INSERT INTO my_table VALUES ('2018-08-08 12:39:56','btnA');
INSERT INTO my_table VALUES ('2018-08-08 12:37:26','btnB');
INSERT INTO my_table VALUES ('2018-08-08 12:37:09','btnA');
INSERT INTO my_table VALUES ('2018-08-08 12:36:41','btnB');
INSERT INTO my_table VALUES ('2018-08-08 12:34:06','btnA');
INSERT INTO my_table VALUES ('2018-08-08 12:33:56','btnB');
INSERT INTO my_table VALUES ('2018-08-08 12:30:32','btnB');
INSERT INTO my_table VALUES ('2018-08-08 12:29:55','btnB');
INSERT INTO my_table VALUES ('2018-07-13 01:48:17','btnB');
INSERT INTO my_table VALUES ('2018-07-12 03:31:07','btnA');
INSERT INTO my_table VALUES ('2018-07-12 01:52:50','btnB');
INSERT INTO my_table VALUES ('2018-07-11 17:01:56','btnA');
INSERT INTO my_table VALUES ('2018-07-11 16:32:16','btnA');
INSERT INTO my_table VALUES ('2018-07-09 06:56:49','btnB');

Query #1

SELECT 'btnA' from_item_id,
       'btnB' to_item_id,
       sum(secondDiff) total_seconds,
       sum(secondDiff) / COUNT(*) average_seconds
FROM (
  SELECT *,
    -- Seconds from a btnA click to the very next event, counted only when that event is btnB
    (CASE WHEN item_id = 'btnA'
           AND LEAD(item_id) OVER (ORDER BY event_timestamp) = 'btnB'
      THEN extract(epoch from (LEAD(event_timestamp) OVER (ORDER BY event_timestamp) - event_timestamp))
      ELSE 0 END) secondDiff
  FROM my_table
) t1
-- Keep only the btnA -> btnB transitions so COUNT(*) counts pairs, not all rows
WHERE secondDiff > 0;

| from_item_id | to_item_id | total_seconds | average_seconds |
| ------------ | ---------- | ------------- | --------------- |
| btnA         | btnB       | 112256        | 28064           |
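
To make the LEAD-based pairing easier to follow, here is a rough Python analogue of the same idea, offered as a sketch rather than a translation of the query. The function name `adjacent_intervals` is hypothetical; the rows are the sample data from the question. Each row is compared with the row that immediately follows it in time, which is what `LEAD(...) OVER (ORDER BY event_timestamp)` provides in SQL.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the question (order does not matter; they are sorted below).
rows = [
    ("2018-08-08 12:39:56", "btnA"), ("2018-08-08 12:37:26", "btnB"),
    ("2018-08-08 12:37:09", "btnA"), ("2018-08-08 12:36:41", "btnB"),
    ("2018-08-08 12:34:06", "btnA"), ("2018-08-08 12:33:56", "btnB"),
    ("2018-08-08 12:30:32", "btnB"), ("2018-08-08 12:29:55", "btnB"),
    ("2018-07-13 01:48:17", "btnB"), ("2018-07-12 03:31:07", "btnA"),
    ("2018-07-12 01:52:50", "btnB"), ("2018-07-11 17:01:56", "btnA"),
    ("2018-07-11 16:32:16", "btnA"), ("2018-07-09 06:56:49", "btnB"),
]


def adjacent_intervals(rows, from_id="btnA", to_id="btnB"):
    """Seconds between each from_id click and an immediately following to_id click."""
    ordered = sorted((datetime.strptime(ts, FMT), item) for ts, item in rows)
    out = []
    # Pairing each row with its successor mirrors LEAD over ascending timestamps.
    for (ts, item), (next_ts, next_item) in zip(ordered, ordered[1:]):
        if item == from_id and next_item == to_id:
            out.append(int((next_ts - ts).total_seconds()))
    return out


intervals = adjacent_intervals(rows)
total_seconds = sum(intervals)
average_seconds = total_seconds // len(intervals)

print(total_seconds, average_seconds)  # 112256 28064
```

Only a btnA click directly followed by a btnB click contributes an interval, so the btnA click at 16:32:16 (followed by another btnA) and the final btnA click at 12:39:56 (followed by nothing) are skipped.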

Answer 1 (score: 1)

I would use a conditional cumulative min to compute the time of the next btnB event. This seems like the simplest approach:

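select user_id,
       'btnA' from_item_id,
       'btnB' to_item_id,
       sum(datediff(second, event_timestamp, next_b)) as total_seconds,
       avg(datediff(second, event_timestamp, next_b)) as average_seconds
from (select t.*,
             -- Earliest btnB timestamp at or after this row, per user.
             -- Redshift requires an explicit frame when an aggregate window function has ORDER BY.
             min(case when item_id = 'btnB' then event_timestamp end)
                 over (partition by user_id
                       order by event_timestamp desc
                       rows between unbounded preceding and current row) as next_b
      from my_table t
     ) t
where item_id = 'btnA'
group by user_id;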
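
The running-min idea can be sketched in Python as follows. This is an illustration under assumptions, not the answerer's code: the function name `next_b_intervals` is hypothetical, the rows are the question's sample data, and a single user is assumed, so there is no partitioning. Scanning from newest to oldest, the most recently seen btnB timestamp is the earliest btnB that lies after the current row.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the question, for a single user.
rows = [
    ("2018-08-08 12:39:56", "btnA"), ("2018-08-08 12:37:26", "btnB"),
    ("2018-08-08 12:37:09", "btnA"), ("2018-08-08 12:36:41", "btnB"),
    ("2018-08-08 12:34:06", "btnA"), ("2018-08-08 12:33:56", "btnB"),
    ("2018-08-08 12:30:32", "btnB"), ("2018-08-08 12:29:55", "btnB"),
    ("2018-07-13 01:48:17", "btnB"), ("2018-07-12 03:31:07", "btnA"),
    ("2018-07-12 01:52:50", "btnB"), ("2018-07-11 17:01:56", "btnA"),
    ("2018-07-11 16:32:16", "btnA"), ("2018-07-09 06:56:49", "btnB"),
]


def next_b_intervals(rows, from_id="btnA", to_id="btnB"):
    """Seconds from every from_id click to the nearest later to_id click."""
    newest_first = sorted(
        ((datetime.strptime(ts, FMT), item) for ts, item in rows), reverse=True
    )
    next_to = None  # running minimum of to_id timestamps seen so far
    out = []
    for ts, item in newest_first:
        if item == to_id:
            next_to = ts  # descending scan: each new to_id is the smallest so far
        elif item == from_id and next_to is not None:
            out.append(int((next_to - ts).total_seconds()))
    return out


intervals = next_b_intervals(rows)
total_seconds = sum(intervals)
average_seconds = total_seconds // len(intervals)

print(sorted(intervals))               # [17, 155, 31854, 33634, 80230]
print(total_seconds, average_seconds)  # 145890 29178
```

On the sample rows this sketch returns five intervals rather than four, because the btnA click at 2018-07-11 16:32:16 is also matched to the btnB click at 2018-07-12 01:52:50 (33634 seconds). If only a btnA click immediately followed by btnB should count, as in the question's expected output of 112256 and 28064, consecutive btnA clicks would need to be filtered out first.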