给出这样的表格,我想在更改为其他状态之前计算每个状态的持续时间:
id state timestamp
1 1 2018-08-17 10:40:00
1 2 2018-08-17 12:40:00
1 1 2018-08-17 14:40:00
2 1 2018-08-17 09:00:00
2 2 2018-08-17 12:00:00
我想要的输出是:
id state date duration
1 1 2018-08-17 2 hours
1 2 2018-08-17 2 hours
1 1 2018-08-17 9 hours 20 minutes (until the end of the day in this case)
2 1 2018-08-17 3 hours
2 2 2018-08-17 12 hours (until the end of the day in this case)
我不确定这在SQL中是否可行。我觉得我必须针对聚合状态和时间戳(按id分组并按ts排序)编写UDF,以输出一个结构数组(id,状态,日期和持续时间)。该数组可以展平。
答案 0 :(得分:3)
以下是用于BigQuery标准SQL
#standardSQL
SELECT id, state,
IFNULL(
TIMESTAMP_DIFF(LEAD(ts) OVER(PARTITION BY id ORDER BY ts), ts, MINUTE),
24*60 - TIMESTAMP_DIFF(ts, TIMESTAMP_TRUNC(ts, DAY), MINUTE)
) AS duration_minutes
FROM `project.dataset.table`
您可以使用问题中的虚拟数据进行上述测试和操作:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 1 state, TIMESTAMP('2018-08-17 10:40:00') ts UNION ALL
SELECT 1, 2, '2018-08-17 12:40:00' UNION ALL
SELECT 1, 1, '2018-08-17 14:40:00' UNION ALL
SELECT 2, 1, '2018-08-17 09:00:00' UNION ALL
SELECT 2, 2, '2018-08-17 12:00:00'
)
SELECT id, state,
IFNULL(
TIMESTAMP_DIFF(LEAD(ts) OVER(PARTITION BY id ORDER BY ts), ts, MINUTE),
24*60 - TIMESTAMP_DIFF(ts, TIMESTAMP_TRUNC(ts, DAY), MINUTE)
) AS duration_minutes
FROM `project.dataset.table`
-- ORDER BY id, ts
结果如下
Row id state duration_minutes
1 1 1 120
2 1 2 120
3 1 1 560
4 2 1 180
5 2 2 720
如果您需要将输出格式设置为与所显示的问题完全相同,请在下面使用
#standardSQL
SELECT id, state, ts, duration_minutes,
FORMAT('%i hours %i minutes', DIV(duration_minutes, 60), MOD(duration_minutes, 60)) duration
FROM (
SELECT id, state, ts,
IFNULL(
TIMESTAMP_DIFF(LEAD(ts) OVER(PARTITION BY id ORDER BY ts), ts, MINUTE),
24*60 - TIMESTAMP_DIFF(ts, TIMESTAMP_TRUNC(ts, DAY), MINUTE)
) AS duration_minutes
FROM `project.dataset.table`
)
在这种情况下,您的输出将如下所示
Row id state ts duration_minutes duration
1 1 1 2018-08-17 10:40:00 UTC 120 2 hours 0 minutes
2 1 2 2018-08-17 12:40:00 UTC 120 2 hours 0 minutes
3 1 1 2018-08-17 14:40:00 UTC 560 9 hours 20 minutes
4 2 1 2018-08-17 09:00:00 UTC 180 3 hours 0 minutes
5 2 2 2018-08-17 12:00:00 UTC 720 12 hours 0 minutes
当然,您很可能仍需要根据具体情况进行调整-但我认为您有一个很好的开端