Splitting values into two columns based on time of day in BigQuery

Time: 2019-04-25 10:46:09

Tags: google-bigquery

The energy usage of each device is recorded once per hour:

+--------------+-----------+-----------------------+
| energy_usage | device_id |  timestamp            |
+--------------+-----------+-----------------------+
| 10           | 1         |  2019-02-12T01:00:00  |
| 16           | 2         |  2019-02-12T01:00:00  |
| 26           | 1         |  2019-03-12T02:00:00  |
| 24           | 2         |  2019-03-12T02:00:00  |
+--------------+-----------+-----------------------+

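My goals are:

  1. Create two columns, one for energy_usage_day (8 am to 8 pm) and one for energy_usage_night (8 pm to 8 am)
  2. Create monthly totals, grouping by device_id and summing the energy usage
  3. Remove rows where the monthly energy usage is below 50

So the result might look like this: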

+--------------+------------------+--------------------+-----------+---------+------+
| energy_usage | energy_usage_day | energy_usage_night | device_id |  month  | year |
+--------------+------------------+--------------------+-----------+---------+------+
| 80           | 30               | 50                 | 1         | 2       | 2019 |
| 130          | 60               | 70                 | 2         | 3       | 2019 |
+--------------+------------------+--------------------+-----------+---------+------+

For step 2, I would use

SUM(energy_usage) OVER (PARTITION BY device_id ORDER BY FORMAT_TIMESTAMP("%m", TIMESTAMP(timestamp))) 

But I am not sure how to do step 1. Is it even possible in BigQuery?

1 answer:

Answer 0 (score: 1)

Use IF inside SUM; there is no need for OVER

SELECT SUM(energy_usage) energy_usage
  -- hours 8..19 cover 08:00 up to 19:59, i.e. 8 am to 8 pm
  , SUM(IF(EXTRACT(HOUR FROM timestamp) BETWEEN 8 AND 19, energy_usage, 0)) energy_usage_day
  -- every other hour (20..23 and 0..7) counts as night
  , SUM(IF(EXTRACT(HOUR FROM timestamp) NOT BETWEEN 8 AND 19, energy_usage, 0)) energy_usage_night
  , device_id
  , EXTRACT(MONTH FROM timestamp) month, EXTRACT(YEAR FROM timestamp) year
FROM `data`
GROUP BY device_id, month, year
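
The query above covers steps 1 and 2 but does not filter out low-usage months (step 3). One way to add that filter is sketched below. It wraps the same aggregation in a CTE and filters in the outer query. The inline sample rows, the CTE name `monthly`, and the alias `total_usage` are assumptions made for illustration and are not part of the original answer; the column is also assumed to be of type TIMESTAMP.

```sql
-- Sketch only: the rows below are hypothetical stand-ins for the `data` table.
WITH data AS (
  SELECT 10 AS energy_usage, 1 AS device_id, TIMESTAMP '2019-02-12 01:00:00' AS timestamp UNION ALL
  SELECT 45, 1, TIMESTAMP '2019-02-12 09:00:00' UNION ALL
  SELECT 16, 2, TIMESTAMP '2019-02-12 01:00:00'
),
monthly AS (
  -- Same aggregation as in the answer; the total gets a distinct alias
  -- (total_usage) so the outer filter cannot be confused with the source column.
  SELECT SUM(energy_usage) AS total_usage
    , SUM(IF(EXTRACT(HOUR FROM timestamp) BETWEEN 8 AND 19, energy_usage, 0)) AS energy_usage_day
    , SUM(IF(EXTRACT(HOUR FROM timestamp) NOT BETWEEN 8 AND 19, energy_usage, 0)) AS energy_usage_night
    , device_id
    , EXTRACT(MONTH FROM timestamp) AS month
    , EXTRACT(YEAR FROM timestamp) AS year
  FROM data
  GROUP BY device_id, month, year
)
-- Step 3: keep only device/month rows whose total is at least 50.
SELECT *
FROM monthly
WHERE total_usage >= 50
```

With these sample rows, device 1 totals 55 for February 2019 (45 day, 10 night) and is kept, while device 2 totals 16 and is dropped. Note that EXTRACT on a TIMESTAMP uses UTC unless a zone is given, so if the day/night boundary should follow local time, something like `EXTRACT(HOUR FROM timestamp AT TIME ZONE 'Europe/Berlin')` may be needed (the zone name here is only an example). If the column is stored as a string, as the `TIMESTAMP(timestamp)` call in the question suggests it might be, it would need the same conversion before EXTRACT.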