How do I match up dates based on a secondary field?

Time: 2017-02-17 22:08:28

Tags: sql google-bigquery

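In Google BigQuery, I'm trying to pair each event's start time with an end time, where the end time is defined as the latest time whose event type does not match the event type of the start row.

Here is an example that illustrates my problem:

Original dataset: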

Name    Event     Event Type    Datetime
****    ******    **********    ****************
Bob     Tennis    Start         2017-02-17 8:00
Bob     Tennis    Playing       2017-02-17 8:10
Bob     Tennis    Playing       2017-02-17 8:20
Bob     Tennis    Playing       2017-02-17 8:30
Bob     Tennis    Playing       2017-02-17 8:50
Bob     Tennis    Start         2017-02-17 10:00
Bob     Tennis    Playing       2017-02-17 10:30
Bob     Bowling   Start         2017-02-18 2:15
Bob     Bowling   Playing       2017-02-18 2:18

Desired table:

Name    Event     Start Datetime      End Datetime
****    ******    ****************    ****************
Bob     Tennis    2017-02-17 8:00     2017-02-17 8:50
Bob     Tennis    2017-02-17 10:00    2017-02-17 10:30
Bob     Bowling   2017-02-18 2:15     2017-02-18 2:18

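I know the solution must involve the partition and max functions, but I'm not sure how to find the maximum datetime among rows whose event type does not match the current row's type.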

1 Answer:

Answer 0: (score: 3)

Try the following; it should give you the idea:

#standardSQL
SELECT Name, Event, MIN(DateTime) AS StartDateTime, MAX(DateTime) AS EndDateTime
FROM (
  SELECT Name, Event, EventType, DateTime,
    -- Running count of 'Start' rows: every row of one session gets the same grp value
    COUNTIF(EventType = 'Start') OVER(PARTITION BY Name, Event ORDER BY DateTime) AS grp
  FROM yourTable
)
GROUP BY Name, Event, grp
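
To make the grouping trick easier to follow, here is a minimal Python sketch of what the query appears to do: it keeps a running count of 'Start' rows per (Name, Event), uses that count as the group id, and takes the earliest and latest timestamp in each group. The names `rows` and `sessions` are made up for this illustration. The sketch assumes zero-padded timestamp strings and no tied timestamps within a partition. It is not BigQuery code.

```python
# Illustrative sample rows: (Name, Event, EventType, DateTime as a zero-padded string)
rows = [
    ("Bob", "Tennis", "Start", "2017-02-17 08:00"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:10"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:50"),
    ("Bob", "Tennis", "Start", "2017-02-17 10:00"),
    ("Bob", "Tennis", "Playing", "2017-02-17 10:30"),
    ("Bob", "Bowling", "Start", "2017-02-18 02:15"),
    ("Bob", "Bowling", "Playing", "2017-02-18 02:18"),
]


def sessions(rows):
    """Group rows by a running count of 'Start' rows, then take min/max per group."""
    running = {}  # (name, event) -> number of 'Start' rows seen so far
    bounds = {}   # (name, event, grp) -> (earliest, latest) timestamp
    # Sorting by (name, event, datetime) plays the role of PARTITION BY ... ORDER BY.
    for name, event, event_type, dt in sorted(rows, key=lambda r: (r[0], r[1], r[3])):
        key = (name, event)
        # Each 'Start' row bumps the counter, which opens a new group.
        running[key] = running.get(key, 0) + (event_type == "Start")
        group_key = (name, event, running[key])
        lo, hi = bounds.get(group_key, (dt, dt))
        bounds[group_key] = (min(lo, dt), max(hi, dt))
    result = [(n, e, lo, hi) for (n, e, _), (lo, hi) in bounds.items()]
    return sorted(result, key=lambda r: r[2])  # order by session start


for session in sessions(rows):
    print(session)
```

Because each 'Start' row increases the counter, all 'Playing' rows that follow it share its group id until the next 'Start' arrives, which is why grouping by that id yields one row per session.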

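You can test it with the dummy data below. Place this WITH clause between the #standardSQL line and the SELECT of the query above: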

WITH yourTable AS (
  SELECT 'Bob' AS Name, 'Tennis' AS Event, 'Start' AS EventType, '2017-02-17 08:00' AS DateTime UNION ALL
  SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:10' UNION ALL
  SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:20' UNION ALL
  SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:30' UNION ALL
  SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:50' UNION ALL
  SELECT 'Bob', 'Tennis', 'Start', '2017-02-17 10:00' UNION ALL
  SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 10:30' UNION ALL
  SELECT 'Bob', 'Bowling', 'Start', '2017-02-18 02:15' UNION ALL
  SELECT 'Bob', 'Bowling', 'Playing', '2017-02-18 02:18' 
)
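
One detail that may be worth noting: the dummy data stores DateTime as zero-padded strings ('08:00' rather than '8:00' as in the question). With strings, MIN and MAX compare character by character, so unpadded hours can sort out of chronological order. The Python sketch below only illustrates that general string-ordering effect. The variable names are hypothetical, and the assumption that the real column might be an unpadded string rather than a proper timestamp type is mine, not the asker's.

```python
from datetime import datetime

# Two timestamps written without zero-padding, as in the question's table.
unpadded = ["2017-02-17 8:50", "2017-02-17 10:00"]


def parse(text):
    """Parse 'YYYY-MM-DD H:MM' text into a datetime for chronological comparison."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


latest_as_text = max(unpadded)             # character-by-character comparison
latest_as_time = max(unpadded, key=parse)  # chronological comparison

print("max as text:", latest_as_text)
print("max as time:", latest_as_time)
```

If the real column holds strings like these, converting them to a timestamp type (or zero-padding them) before applying MIN and MAX would avoid the mismatch.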