SUM loop in BigQuery

Asked: 2018-11-24 19:23:00

Tags: google-bigquery

Is this kind of aggregation possible in BigQuery? I have two fields: datetime and value (FLOAT64). A value is written to the table every 10 minutes:

-----------------------------------
| datetime              | value   |
-----------------------------------
| 2018-11-01T09:00:05   | 1.1     |
| 2018-11-01T09:10:01   | 1.2     |
| 2018-11-01T09:20:59   | 2.4     |
| 2018-11-01T09:30:18   | 0.8     |
| ...                   | ...     |
| 2018-11-21T22:50:04   | 2.1     |
| ...                   | ...     |
| 2018-11-30T23:59:59   | 4.2     |
-----------------------------------

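Is there a way to get a summary table that contains a date and the sum of all values from the start up to that date? For example, one month would have 30 (or 31) date rows, and the value in each row would be the sum of all values up to and including that day: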

-----------------------------------------------------------------------
| date                  | value                                       |
-----------------------------------------------------------------------
| 2018-11-01            | SUM of all values 2018-11-01...2018-11-01   |
| 2018-11-02            | SUM of all values 2018-11-01...2018-11-02   |
| 2018-11-03            | SUM of all values 2018-11-01...2018-11-03   |
| 2018-11-04            | SUM of all values 2018-11-01...2018-11-04   |
| ...                   | ...                                         |
| 2018-11-20            | SUM of all values 2018-11-01...2018-11-20   |
| ...                   | ...                                         |
| 2018-11-30            | SUM of all values 2018-11-01...2018-11-30   |
-----------------------------------------------------------------------

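2 answers:

Answer 0 (score: 1)

Below is for BigQuery Standard SQL. First group by day and sum all values for each day, then apply a window function to get the final result: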

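#standardSQL
-- Inner query: one total per day (dt stands for the question's datetime column).
-- Outer query: running total over the days in order.
SELECT
  day, SUM(value) OVER(ORDER BY day) value
FROM (
  SELECT DATE(dt) day, SUM(value) value
  FROM `project.dataset.table`
  GROUP BY day
)

If you need the sum to "reset" every month, you can use the following instead:

#standardSQL
-- PARTITION BY month restarts the running total on the first day of each month.
SELECT
  day, SUM(value) OVER(PARTITION BY DATE_TRUNC(day, MONTH) ORDER BY day) value
FROM (
  SELECT DATE(dt) day, SUM(value) value
  FROM `project.dataset.table`
  GROUP BY day
)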
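
To make the two steps concrete, here is a minimal Python sketch (not BigQuery code) that mimics them on a few made-up rows: first a per-day total, then a running total over the days, with an optional monthly reset. The sample rows and the helper names daily_totals and running_sum are hypothetical and only serve as an illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows shaped like the question's table: (datetime, value).
rows = [
    ("2018-11-01T09:00:05", 1.0),
    ("2018-11-01T09:10:01", 2.0),
    ("2018-11-02T09:20:59", 4.0),
    ("2018-12-01T00:00:10", 8.0),
]


def daily_totals(rows):
    """Step 1: the equivalent of GROUP BY day with SUM(value)."""
    totals = defaultdict(float)
    for ts, value in rows:
        totals[datetime.fromisoformat(ts).date()] += value
    return sorted(totals.items())


def running_sum(daily, reset_monthly=False):
    """Step 2: the equivalent of SUM(value) OVER (ORDER BY day).

    With reset_monthly=True the total restarts whenever the month changes,
    which is what partitioning the window by month would do.
    """
    result = []
    total = 0.0
    current_month = None
    for day, value in daily:
        month = (day.year, day.month)
        if reset_monthly and month != current_month:
            total = 0.0
        current_month = month
        total += value
        result.append((day.isoformat(), total))
    return result


daily = daily_totals(rows)
cumulative = running_sum(daily)
monthly = running_sum(daily, reset_monthly=True)
print(cumulative)
print(monthly)
```

With these rows the two results agree for November and differ only on 2018-12-01, where the monthly variant starts again from that day's own total.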

Answer 1 (score: 0)

CTEs in BigQuery often make things easier to follow. With your datetime values, this should work:

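with datevals as (
  -- one row per calendar day with that day's total
  select date(datetime) as date, sum(value) as value
  from `dataset.table`
  group by 1
)
-- for each day, add up the daily totals of that day and all earlier days
select
  a.date as dt,
  sum((select sum(b.value) from datevals b where b.date <= a.date)) as value
from datevals a
group by 1
order by 1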
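
As a rough illustration of how this formulation relates to a window function, the Python sketch below (again not BigQuery code) computes the same result both ways on made-up per-day totals. The datevals list is a hypothetical stand-in for the CTE. The subquery form rescans all earlier days for every row, while the window form carries the total forward in a single ordered pass.

```python
from itertools import accumulate

# Hypothetical per-day totals, standing in for the datevals CTE.
datevals = [("2018-11-01", 3.0), ("2018-11-02", 4.0), ("2018-11-03", 0.5)]

# Correlated-subquery form: for each date a, sum every b with b.date <= a.date.
# ISO date strings compare in chronological order, so plain <= works here.
by_subquery = [
    (a_date, sum(b_value for b_date, b_value in datevals if b_date <= a_date))
    for a_date, _ in datevals
]

# Window-function form: one ordered pass that carries the running total.
by_window = list(zip(
    (d for d, _ in datevals),
    accumulate(v for _, v in datevals),
))

print(by_subquery == by_window)
```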