I am trying to query a table that uses a basic repeated field to store data like this:
+---+----------+------------+
| i | data.key | data.value |
+---+----------+------------+
| 0 | a | 1 |
| | b | 2 |
| 1 | a | 3 |
| | b | 4 |
| 2 | a | 5 |
| | b | 6 |
| 3 | a | 7 |
| | b | 8 |
+---+----------+------------+
I am trying to figure out how to write a query that produces this result:
+---+----+----+
| i | a | b |
+---+----+----+
| 1 | 4 | 6 |
| 3 | 12 | 14 |
+---+----+----+
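where each row represents a non-overlapping sum (i.e. the row for i=1 is the sum of rows i=0 and i=1) and the data has been pivoted so that data.key is now a column.
I did my best to convert this answer to standard SQL, and ended up with: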
SELECT
  i,
  (SELECT SUM(value) FROM UNNEST(data) WHERE key = 'a') AS `a`,
  (SELECT SUM(value) FROM UNNEST(data) WHERE key = 'b') AS `b`
FROM
  `dataset.testing.dummy`
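This works, but I am wondering whether there is a better way to do it, especially since it produces a particularly verbose query once analytic functions are involved: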
SELECT
  i,
  SUM(a) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS `a`,
  SUM(b) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS `b`
FROM (
  SELECT
    i,
    (SELECT SUM(value) FROM UNNEST(data) WHERE key = 'a') AS `a`,
    (SELECT SUM(value) FROM UNNEST(data) WHERE key = 'b') AS `b`
  FROM
    `dataset.testing.dummy`)
ORDER BY
  i;
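How do I write the ROWS or RANGE clause so that the resulting windows do not overlap? With the last query I get a rolling sum of the data, which is not what I want: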
+---+----+----+
| i | a | b |
+---+----+----+
| 0 | 1 | 2 |
| 1 | 4 | 6 |
| 2 | 8 | 10 |
| 3 | 12 | 14 |
+---+----+----+
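The rolling sum produces one result for every input row, whereas I am trying to reduce the number of rows returned.
Answer 0 (score: 1)
Using a temporary SQL function and a named window helps with the verbosity. I did have to use another subselect to apply the filter on i afterwards, though, because a WHERE clause in the same SELECT would be applied before the window function is evaluated. Here is a self-contained example: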
#standardSQL
CREATE TEMP FUNCTION SumKey(
  data ARRAY<STRUCT<key STRING, value INT64>>,
  target_key STRING) AS (
  (SELECT SUM(value) FROM UNNEST(data) WHERE key = target_key)
);
WITH Input AS (
  SELECT
    0 AS i,
    ARRAY<STRUCT<key STRING, value INT64>>[('a', 1), ('b', 2)] AS data UNION ALL
  SELECT 1, ARRAY<STRUCT<key STRING, value INT64>>[('a', 3), ('b', 4)] UNION ALL
  SELECT 2, ARRAY<STRUCT<key STRING, value INT64>>[('a', 5), ('b', 6)] UNION ALL
  SELECT 3, ARRAY<STRUCT<key STRING, value INT64>>[('a', 7), ('b', 8)]
)
SELECT * FROM (
  SELECT
    i,
    SUM(a) OVER W AS a,
    SUM(b) OVER W AS b
  FROM (
    SELECT
      i,
      SumKey(data, 'a') AS a,
      SumKey(data, 'b') AS b
    FROM Input
  )
  WINDOW W AS (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
)
WHERE MOD(i, 2) = 1
ORDER BY i;
This results in:
+---+----+----+
| i | a | b |
+---+----+----+
| 1 | 4 | 6 |
| 3 | 12 | 14 |
+---+----+----+
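Since the goal is fewer output rows rather than a rolling value per row, a plain GROUP BY may also be worth considering in place of a window. The following is only a sketch, and it assumes that i is a gap-free sequence of integers starting at 0, so that DIV(i, 2) places rows 0 and 1 in one bucket, rows 2 and 3 in the next, and so on. The names Bucketed and bucket are made up for this illustration and do not come from the question or the answer above.

```sql
#standardSQL
WITH Input AS (
  SELECT
    0 AS i,
    ARRAY<STRUCT<key STRING, value INT64>>[('a', 1), ('b', 2)] AS data UNION ALL
  SELECT 1, ARRAY<STRUCT<key STRING, value INT64>>[('a', 3), ('b', 4)] UNION ALL
  SELECT 2, ARRAY<STRUCT<key STRING, value INT64>>[('a', 5), ('b', 6)] UNION ALL
  SELECT 3, ARRAY<STRUCT<key STRING, value INT64>>[('a', 7), ('b', 8)]
),
-- Flatten the repeated field and tag every element with its bucket number.
-- Assumption: i is consecutive from 0, so each bucket covers exactly two rows.
Bucketed AS (
  SELECT
    DIV(i, 2) AS bucket,
    i,
    d.key AS key,
    d.value AS value
  FROM Input, UNNEST(data) AS d
)
SELECT
  MAX(i) AS i,                                    -- label the bucket with its last i
  SUM(IF(key = 'a', value, NULL)) AS a,           -- pivot key 'a' into a column
  SUM(IF(key = 'b', value, NULL)) AS b            -- pivot key 'b' into a column
FROM Bucketed
GROUP BY bucket
ORDER BY bucket;
```

With the sample data this should return the same two rows as the table above (i=1 and i=3). If i has gaps or does not start at 0, the buckets would no longer line up with pairs of rows, and a bucket number derived from ROW_NUMBER() would be a safer basis for the grouping.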