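I have a table in BigQuery that shows data every 30 minutes, and I want to show data every 5 minutes. Currently I am using this query to fill the null values with the last existing value: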
SELECT
  SETTLEMENTDATE,
  DUID,
  LAST_VALUE(SCADAVALUE IGNORE NULLS) OVER (
    PARTITION BY DUID ORDER BY SETTLEMENTDATE) AS SCADAVALUE
FROM x
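Instead, is it possible to do a linear interpolation?
My SETTLEMENTDATE column has 5-minute intervals. The SCADAVALUEORIGIN column has a value every 30 minutes and is null otherwise. I want to add a column SCADAINTERPOLATION whose values are distributed evenly between two consecutive 30-minute values. The other issue is that when I refresh the data every 5 minutes, the latest values show as null at minutes 5, 10, 15, 20, and 25 past the last 30-minute mark. I hope my explanation is clear.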
Answer 0 (score: 1)
I speculate that you want something like this:
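select timestamp_add(t.ts, interval mins minute) as ts,
       -- weighted average of this reading and the next one;
       -- the last reading has no next value, so its rows come out null
       (t.val * (30 - mins) + t.next_val * mins) / 30 as val
from (
  -- take lead() before the cross join, so it sees one row per reading
  select ts, val, lead(val) over (order by ts) as next_val
  from t
) t
cross join unnest(generate_array(0, 25, 5)) mins;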
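The query above assumes that t holds only the 30-minute readings and generates the 5-minute rows itself. If the table already contains the 5-minute rows with nulls in between, as the question describes, a variant along the following lines might fit better. This is only a sketch under assumptions that are not confirmed in the question: SETTLEMENTDATE is taken to be a TIMESTAMP (with DATETIME, DATETIME_DIFF would be needed instead), the table and column names are taken from the asker's own query, and prev_val, prev_ts, next_val, next_ts, w_prev and w_next are names made up for illustration.

```sql
#standardSQL
-- Sketch only. Assumes SETTLEMENTDATE is a TIMESTAMP and that x already has
-- one row per DUID every 5 minutes, with SCADAVALUE null off the 30-minute marks.
SELECT
  SETTLEMENTDATE,
  DUID,
  SCADAVALUE AS SCADAVALUEORIGIN,
  CASE
    -- no later reading yet (latest rows), or the row is itself a reading:
    -- fall back to the most recent reading
    WHEN next_val IS NULL OR next_ts = prev_ts THEN prev_val
    -- otherwise move from prev_val toward next_val in proportion to elapsed time
    ELSE prev_val + (next_val - prev_val)
         * TIMESTAMP_DIFF(SETTLEMENTDATE, prev_ts, MINUTE)
         / TIMESTAMP_DIFF(next_ts, prev_ts, MINUTE)
  END AS SCADAINTERPOLATION
FROM (
  SELECT
    SETTLEMENTDATE, DUID, SCADAVALUE,
    -- most recent non-null reading at or before this row, and its timestamp
    LAST_VALUE(SCADAVALUE IGNORE NULLS) OVER w_prev AS prev_val,
    LAST_VALUE(IF(SCADAVALUE IS NULL, NULL, SETTLEMENTDATE) IGNORE NULLS) OVER w_prev AS prev_ts,
    -- next non-null reading at or after this row, and its timestamp
    FIRST_VALUE(SCADAVALUE IGNORE NULLS) OVER w_next AS next_val,
    FIRST_VALUE(IF(SCADAVALUE IS NULL, NULL, SETTLEMENTDATE) IGNORE NULLS) OVER w_next AS next_ts
  FROM x
  WINDOW
    w_prev AS (PARTITION BY DUID ORDER BY SETTLEMENTDATE
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
    w_next AS (PARTITION BY DUID ORDER BY SETTLEMENTDATE
               ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
)
```

Partitioning by DUID keeps each unit's readings separate. For the most recent rows, where the next 30-minute reading has not arrived yet, the sketch carries the last reading forward rather than returning null. Whether that is the right fallback depends on how the refreshed data is used.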
Answer 1 (score: 1)
Below is for BigQuery Standard SQL
#standardSQL
SELECT
TIMESTAMP_ADD(SETTLEMENTDATE, INTERVAL 5 * i MINUTE) AS SETTLEMENTDATE,
IF(i = 0, SCADAVALUEORIGIN, NULL) AS SCADAVALUEORIGIN,
SCADAVALUEORIGIN AS SCADAVALUE,
ROUND(SCADAVALUEORIGIN + IFNULL((next_value - SCADAVALUEORIGIN) / 6 * i, 0), 3) AS SCADAINTERPOLATION
FROM (
SELECT SETTLEMENTDATE, SCADAVALUEORIGIN,
LEAD(SCADAVALUEORIGIN) OVER(ORDER BY SETTLEMENTDATE) next_value,
FROM `project.dataset.table`
), UNNEST(GENERATE_ARRAY(0, 5)) i
If applied to the sample data in your question, the result is