BigQuery time series data

Time: 2018-08-20 13:38:58

Tags: sql google-bigquery standard-sql

I have a table like this. Each row contains a timestamp and a count, which is the value of the metric at that time.

Row timestamp count
1 2018-08-20 04:01:39.108497 31
2 2018-08-20 04:01:45.109497 45
3 2018-08-20 04:01:49.109497 44
4 2018-08-20 04:02:39.102497 33
5 2018-08-20 04:02:45.101497 41
6 2018-08-20 04:02:49.103497 22
7 2018-08-20 04:03:39.102497 23
8 2018-08-20 04:03:45.102497 42
9 2018-08-20 04:03:49.103497 41

I want to roll it up to a minute-level aggregation using avg(count), like this:

Row timestamp count
1 2018-08-20 04:01:00 40
2 2018-08-20 04:02:00 32
3 2018-08-20 04:03:00 35

Please help. Thanks in advance.

2 answers:

Answer 0: (score: 1)

The below is for BigQuery Standard SQL:

#standardSQL
SELECT TIMESTAMP_TRUNC(ts, MINUTE) dt, CAST(AVG(cnt) AS INT64) viewCount
FROM `project.dataset.table`
GROUP BY dt

You can test it with the dummy data from your question, as below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT TIMESTAMP '2018-08-20 04:01:39.108497' ts, 31 cnt UNION ALL
  SELECT '2018-08-20 04:01:45.109497', 45 UNION ALL
  SELECT '2018-08-20 04:01:49.109497', 44 UNION ALL
  SELECT '2018-08-20 04:02:39.102497', 33 UNION ALL
  SELECT '2018-08-20 04:02:45.101497', 41 UNION ALL
  SELECT '2018-08-20 04:02:49.103497', 22 UNION ALL
  SELECT '2018-08-20 04:03:39.102497', 23 UNION ALL
  SELECT '2018-08-20 04:03:45.102497', 42 UNION ALL
  SELECT '2018-08-20 04:03:49.103497', 41 
)
SELECT TIMESTAMP_TRUNC(ts, MINUTE) dt, CAST(AVG(cnt) AS INT64) viewCount
FROM `project.dataset.table`
GROUP BY dt
-- ORDER BY dt

The result is:

Row dt                      viewCount
1   2018-08-20 04:01:00 UTC 40   
2   2018-08-20 04:02:00 UTC 32   
3   2018-08-20 04:03:00 UTC 35   
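
As a variation on the query above, here is a minimal sketch that returns the unrounded average next to the rounded ones, in case the fractional part matters. It assumes the same placeholder names used in this answer (`project.dataset.table`, `ts`, `cnt`); the output column names are made up for illustration. In BigQuery, casting a FLOAT64 to INT64 returns the closest integer, with halfway cases rounded away from zero, so the cast rounds rather than truncates.

```sql
#standardSQL
-- Sketch only: `project.dataset.table`, ts and cnt are the placeholder
-- names from the answer; avg_cnt* are hypothetical output names.
SELECT
  TIMESTAMP_TRUNC(ts, MINUTE) AS dt,
  AVG(cnt) AS avg_cnt,                     -- FLOAT64, unrounded
  ROUND(AVG(cnt), 2) AS avg_cnt_2dp,       -- rounded to two decimal places
  CAST(AVG(cnt) AS INT64) AS avg_cnt_int   -- rounded to the closest integer
FROM `project.dataset.table`
GROUP BY dt
ORDER BY dt
```

The ORDER BY is there because GROUP BY alone does not guarantee that the rows come back in time order.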

Answer 1: (score: 0)

Just use TIMESTAMP_TRUNC():

select timestamp_trunc(timestamp, minute) as timestamp_min,
       sum(count)  -- or whatever aggregation you want
from t
group by timestamp_min;

Your question is unclear about which aggregation you want. For example, "35" does not appear in the data.
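
One possible reading, sketched here under the assumption that the question's avg(count) is meant literally: the 35 in the expected output would be the rounded average of the three 04:03 samples (23, 42, 41). The query below computes both the sum and the average for that minute so the two can be compared. The CTE name `t` follows this answer; `ts`, `cnt`, `sum_cnt`, and `avg_cnt` are illustrative names, not columns from the asker's real table.

```sql
#standardSQL
-- Sketch only: the three 04:03 rows from the question, inlined as a CTE.
WITH t AS (
  SELECT TIMESTAMP '2018-08-20 04:03:39.102497' AS ts, 23 AS cnt UNION ALL
  SELECT TIMESTAMP '2018-08-20 04:03:45.102497', 42 UNION ALL
  SELECT TIMESTAMP '2018-08-20 04:03:49.103497', 41
)
SELECT
  TIMESTAMP_TRUNC(ts, MINUTE) AS timestamp_min,
  SUM(cnt) AS sum_cnt,  -- 23 + 42 + 41 = 106
  AVG(cnt) AS avg_cnt   -- 106 / 3, roughly 35.33
FROM t
GROUP BY timestamp_min
```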