Window functions and time differences in BigQuery

Time: 2015-11-01 22:37:17

Tags: sql time google-bigquery window-functions

I have a BigQuery table defined as:

+----+----------------------------+------------+
| id |            time            |   event    |
+----+----------------------------+------------+
|  1 | 2015-10-01 16:31:48.000000 | signup     |
|  1 | 2015-10-01 16:41:48.000000 | 1_purchase |
|  1 | 2015-10-01 16:51:48.000000 | 2_purchase |
|  2 | 2015-10-01 16:31:48.000000 | signup     |
|  2 | 2015-10-01 16:41:48.000000 | 1_purchase |
|  3 | 2015-10-01 16:31:48.000000 | signup     |
+----+----------------------------+------------+

I want to compute the time difference within each id group (1, 2, 3) to get a result like this:

+----+----------------------------+------------+-----------------+
| id |            time            |   event    | timedifference  |
+----+----------------------------+------------+-----------------+
|  1 | 2015-10-01 16:31:48.000000 | signup     | -               |
|  1 | 2015-10-01 16:41:48.000000 | 1_purchase | 00:10:00.000000 |
|  1 | 2015-10-01 16:51:48.000000 | 2_purchase | 00:20:00.000000 |
|  2 | 2015-10-01 16:31:48.000000 | signup     | -               |
|  2 | 2015-10-01 16:41:48.000000 | 1_purchase | 00:10:00.000000 |
|  3 | 2015-10-01 16:31:48.000000 | signup     | no_purchase     |
+----+----------------------------+------------+-----------------+

After some research, I think I need to use window functions, but I have not been able to work out a solution. Any help is much appreciated! Best, V

2 answers:

Answer 0: (score: 1)

Yes, you can use analytic window functions. Here is one way to do it with the FIRST_VALUE analytic function:

-- Legacy SQL: a comma between subselects in FROM acts as UNION ALL.
-- Dividing the microsecond difference by 60000000 converts it to minutes.
SELECT id, time, event, (time - firsttime) / 60000000 AS minutes_since_first FROM (
SELECT id, time, event,
       FIRST_VALUE(time) OVER(PARTITION BY id ORDER BY time) AS firsttime FROM
(SELECT 1 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event),
(SELECT 1 id, TIMESTAMP('2015-10-01 16:41:48.000000') time, '1_purchase' event),
(SELECT 1 id, TIMESTAMP('2015-10-01 16:51:48.000000') time, '2_purchase' event),
(SELECT 2 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event),
(SELECT 2 id, TIMESTAMP('2015-10-01 16:41:48.000000') time, '1_purchase' event),
(SELECT 3 id, TIMESTAMP('2015-10-01 16:31:48.000000') time, 'signup' event)
)
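
To make the semantics of that query concrete, here is a minimal Python sketch (not BigQuery code) that mimics FIRST_VALUE(time) OVER(PARTITION BY id ORDER BY time) on the sample rows from the question and reports the minutes elapsed since each id's first event. The helper name `minutes_since_first` and the in-memory row layout are assumptions made purely for illustration.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, time, event).
ROWS = [
    (1, "2015-10-01 16:31:48", "signup"),
    (1, "2015-10-01 16:41:48", "1_purchase"),
    (1, "2015-10-01 16:51:48", "2_purchase"),
    (2, "2015-10-01 16:31:48", "signup"),
    (2, "2015-10-01 16:41:48", "1_purchase"),
    (3, "2015-10-01 16:31:48", "signup"),
]


def minutes_since_first(rows):
    """Hypothetical helper: emulate FIRST_VALUE(time) per id partition."""
    parsed = [
        (i, datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), e) for i, t, e in rows
    ]
    # Sort by partition key (id), then by time, like PARTITION BY ... ORDER BY.
    parsed.sort(key=itemgetter(0, 1))
    out = []
    for _, group in groupby(parsed, key=itemgetter(0)):
        group = list(group)
        first_time = group[0][1]  # earliest time in this partition
        for i, t, e in group:
            out.append((i, e, (t - first_time).total_seconds() / 60))
    return out


RESULT = minutes_since_first(ROWS)
for row in RESULT:
    print(row)
```

With this approach the 2_purchase row for id 1 comes out as 20 minutes, which is the value shown in the expected result in the question.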

Answer 1: (score: 0)

-- LAG fetches the previous event time within each id; the first row per id has no previous time.
select 
  id, time, event, 
  time(sec_to_timestamp((timestamp_to_sec(timestamp(time)) -
    timestamp_to_sec(timestamp(prev_time))))) as timedifference,
  (timestamp_to_sec(timestamp(time)) -
    timestamp_to_sec(timestamp(prev_time)))/60 as timedifference_in_min,

  right('0' + string(datediff(timestamp(time),timestamp(prev_time))),2) + ' ' +
  time(sec_to_timestamp((timestamp_to_sec(timestamp(time)) -     
    timestamp_to_sec(timestamp(prev_time))))) as timedifference_as_dd_hh_mm_ss

from (
  select 
    id, time, event,
    lag(time) over(partition by id order by time) as prev_time
  from (
  select f0_ as id, f1_ as time, f2_ as event from
    (select 1, '2015-10-01 16:31:48.000000', 'signup'),
    (select 1, '2015-10-01 16:41:48.000000', '1_purchase'),
    (select 1, '2015-10-01 16:51:48.000000', '2_purchase'),
    (select 2, '2015-10-01 16:31:48.000000', 'signup'),
    (select 2, '2015-10-01 16:41:48.000000', '1_purchase'),
    (select 3, '2015-10-01 16:31:48.000000', 'signup')
  )
)
order by id, time
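
For comparison, here is a minimal Python sketch (not BigQuery code) of what LAG(time) OVER(PARTITION BY id ORDER BY time) computes on the same sample rows: each row is paired with the previous event time for the same id, and the first row of each id has no previous value. The helper name `diff_from_previous` and the in-memory row layout are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, time, event).
ROWS = [
    (1, "2015-10-01 16:31:48", "signup"),
    (1, "2015-10-01 16:41:48", "1_purchase"),
    (1, "2015-10-01 16:51:48", "2_purchase"),
    (2, "2015-10-01 16:31:48", "signup"),
    (2, "2015-10-01 16:41:48", "1_purchase"),
    (3, "2015-10-01 16:31:48", "signup"),
]


def diff_from_previous(rows):
    """Hypothetical helper: emulate LAG(time) per id partition."""
    parsed = sorted(
        ((i, datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), e) for i, t, e in rows),
        key=itemgetter(0, 1),  # partition key (id) first, then time
    )
    out = []
    for _, group in groupby(parsed, key=itemgetter(0)):
        prev_time = None  # no previous row at the start of a partition
        for i, t, e in group:
            delta = None if prev_time is None else t - prev_time
            out.append((i, e, delta))
            prev_time = t
    return out


DIFFS = diff_from_previous(ROWS)
for row in DIFFS:
    print(row)
```

Note that this measures the gap to the previous event, so the 2_purchase row for id 1 yields 10 minutes, whereas the expected result in the question shows 00:20:00 for that row, i.e. the time since signup.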