Multiple analytic functions use up all the memory in BigQuery

Time: 2018-11-13 21:39:52

Tags: sql google-bigquery

I thought it would be neat to avoid multiple LEFT JOINs and GROUP BYs in BigQuery by using the following code:

    WITH added_a_boolean_column AS (
      SELECT
        *,
        NOT (DATE(CodedDate) >= "2018-04-01"
          AND DATE(CodedDate) < "2018-04-14") AS train
      FROM
        `XXXXX`
    )
    SELECT
      COUNTIF(train) OVER (PARTITION BY a) AS a_counts,
      COUNTIF(train) OVER (PARTITION BY b) AS b_counts,
      COUNTIF(train) OVER (PARTITION BY c) AS c_counts,
      COUNTIF(train) OVER (PARTITION BY d) AS d_counts,
      COUNTIF(train) OVER (PARTITION BY e) AS e_counts,
      COUNTIF(train) OVER (PARTITION BY f) AS f_counts,
      COUNTIF(train) OVER (PARTITION BY g) AS g_counts,
      COUNTIF(train) OVER (PARTITION BY h) AS h_counts,
      COUNTIF(train) OVER (PARTITION BY i) AS i_counts
    FROM added_a_boolean_column

But... it results in the following error:

Resources exceeded during query execution: The query could not be executed in the allotted memory. Peak usage: 152% of limit. Top memory consumer(s): sort operations used for analytic OVER() clauses: 99% other/unattributed: 1% .

What exactly is going on here? Is the following:

 WITH added_a_boolean_column AS (
      SELECT
        *,
        NOT (DATE(CodedDate) >= "2018-04-01"
          AND DATE(CodedDate) < "2018-04-14") AS train
      FROM
      `XXXXX` ),

    a_count as (
        SELECT count(*) as a_counts, a FROM added_a_boolean_column WHERE train GROUP BY a),
    b_count as (.....
    ....
    ....

    i_count as (..

    SELECT * FROM added_a_boolean_column LEFT JOIN a_count.....

a better option?
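
For anyone comparing the two forms, here is a minimal local sketch that checks they return the same per-row counts. It makes several assumptions that are not in the question: it uses Python's built-in sqlite3 (SQLite 3.25 or newer, for window functions) as a stand-in for BigQuery, a hypothetical toy table `t` with only two key columns `a` and `b`, and `SUM(train)` in place of `COUNTIF(train)`, which SQLite lacks. It only illustrates the logic and says nothing about how either form behaves against BigQuery's memory limit.

```python
import sqlite3

# Hypothetical toy data: two key columns (a, b) and the boolean train flag (1/0).
rows = [
    ("x", "p", 1),
    ("x", "q", 0),
    ("y", "p", 1),
    ("y", "p", 1),
    ("z", "q", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, b TEXT, train INTEGER)"
)
conn.executemany("INSERT INTO t (a, b, train) VALUES (?, ?, ?)", rows)

# Form 1: one analytic function per key column, as in the first query.
window_sql = """
SELECT id,
       SUM(train) OVER (PARTITION BY a) AS a_counts,
       SUM(train) OVER (PARTITION BY b) AS b_counts
FROM t
ORDER BY id
"""

# Form 2: aggregate each key column once, then join the small results back.
join_sql = """
WITH a_count AS (SELECT a, SUM(train) AS a_counts FROM t GROUP BY a),
     b_count AS (SELECT b, SUM(train) AS b_counts FROM t GROUP BY b)
SELECT t.id, a_count.a_counts, b_count.b_counts
FROM t
LEFT JOIN a_count ON t.a = a_count.a
LEFT JOIN b_count ON t.b = b_count.b
ORDER BY t.id
"""

window_result = conn.execute(window_sql).fetchall()
join_result = conn.execute(join_sql).fetchall()

# Both forms should yield identical (id, a_counts, b_counts) rows.
print(window_result == join_result)
```

Two details matter when carrying this over to the real query. Aggregating without `WHERE train` keeps keys that have no `train` rows at a count of 0; with `WHERE train ... GROUP BY a`, those keys disappear from `a_count` and the LEFT JOIN returns NULL instead. Also, `PARTITION BY` puts rows with a NULL key into one partition, while an equality join never matches NULL keys, so the two forms differ for rows whose key is NULL.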

0 answers:

No answers yet