我以为避免在Big查询中使用以下代码避免使用多个LEFT JOIN和GROUP BY是很不错的:
WITH added_a_boolean_column AS (
SELECT
*,
NOT (DATE(CodedDate) >= "2018-04-01"
AND DATE(CodedDate) < "2018-04-14") AS train
FROM
`XXXXX` )
SELECT
countif(train) OVER (PARTITION BY a) as a_counts,
countif(train) OVER (PARTITION BY b) as b_counts,
countif(train) OVER (PARTITION BY c) as c_counts,
countif(train) OVER (PARTITION BY d) as d_counts,
countif(train) OVER (PARTITION BY e) as e_counts,
countif(train) OVER (PARTITION BY f) as f_counts,
countif(train) OVER (PARTITION BY g) as g_counts,
countif(train) OVER (PARTITION BY h) as h_counts,
countif(train) OVER (PARTITION BY i) as i_counts
FROM added_a_boolean_column
但是...它导致以下错误:
Resources exceeded during query execution: The query could not be executed in the allotted memory. Peak usage: 152% of limit. Top memory consumer(s): sort operations used for analytic OVER() clauses: 99% other/unattributed: 1% .
到底是怎么回事?是:
WITH added_a_boolean_column AS (
SELECT
*,
NOT (DATE(CodedDate) >= "2018-04-01"
AND DATE(CodedDate) < "2018-04-14") AS train
FROM
`XXXXX` ),
a_count as (
SELECT count(*) as a_counts, a FROM added_a_boolean_column WHERE train GROUP BY a),
b_count as (.....
....
....
i_count as (..
SELECT * FROM added_a_boolean_column LEFT JOIN a_count.....
更好的选择?