Using ARRAY_AGG() when all input values are NULL

Time: 2016-10-04 05:57:07

Tags: sql google-bigquery

I want to select the distinct values of a column for each user in a table (in Google BigQuery). I thought about using ARRAY_AGG(), like this:

SELECT user_id, ARRAY_AGG(DISTINCT field1) AS f1, ARRAY_AGG(DISTINCT field2) AS f2
FROM t GROUP BY user_id

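However, for some user_id values, every value in field1 or field2 is NULL, and I get the following error message: Array 'f1' cannot have a null element

Is there a workaround that avoids this error, or a different way to get the same result without using ARRAY_AGG()?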

4 Answers:

Answer 0: (score: 4)

From https://cloud.google.com/bigquery/sql-reference/data-types#array-type:

  

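BigQuery raises an error if the query result contains ARRAYs that contain NULL elements, although such ARRAYs can be used inside the query.

Your query is fine as an intermediate query but not as a final query result. As a workaround, you can define your query as a tmp table and filter out the NULL values before returning the final result: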

WITH tmp AS (
  SELECT user_id,
         ARRAY_AGG(DISTINCT field1) AS f1,
         ARRAY_AGG(DISTINCT field2) AS f2
  FROM t
  GROUP BY user_id
)
SELECT user_id,
       ARRAY(SELECT el FROM UNNEST(f1) AS el WHERE el IS NOT NULL) AS f1,
       ARRAY(SELECT el FROM UNNEST(f2) AS el WHERE el IS NOT NULL) AS f2
FROM tmp

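I ran into the same problem when porting some Postgres SQL to BigQuery. A more elegant solution is the FILTER clause for aggregate functions:

https://www.postgresql.org/docs/current/static/sql-expressions.html

BigQuery has no such clause, and I really hope they implement it.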
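
For reference, a minimal sketch of what the FILTER clause looks like in PostgreSQL when applied to the question's table. This is an illustration, not the snippet from the answer above, and it is not valid BigQuery syntax; t, user_id, field1 and field2 are the names used in the question.

```sql
-- PostgreSQL (9.4+): FILTER drops rows before they reach the aggregate,
-- so NULL values never enter the array.
SELECT user_id,
       array_agg(DISTINCT field1) FILTER (WHERE field1 IS NOT NULL) AS f1,
       array_agg(DISTINCT field2) FILTER (WHERE field2 IS NOT NULL) AS f2
FROM t
GROUP BY user_id;
```

In PostgreSQL, a group whose rows are all filtered out yields NULL for that aggregate rather than an empty array.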

Answer 1: (score: 2)

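-- Filter out NULLs before aggregating, one CTE per column.
WITH t1 AS (
  SELECT user_id, ARRAY_AGG(DISTINCT field1) AS f1
  FROM t WHERE field1 IS NOT NULL
  GROUP BY user_id
),
t2 AS (
  SELECT user_id, ARRAY_AGG(DISTINCT field2) AS f2
  FROM t WHERE field2 IS NOT NULL
  GROUP BY user_id
)
-- A user may appear in only one CTE, so take user_id from whichever side has it.
-- Users whose field1 and field2 are both entirely NULL appear in neither CTE
-- and are therefore missing from the result.
SELECT COALESCE(t1.user_id, t2.user_id) AS user_id, f1, f2
FROM t1 FULL JOIN t2
ON t1.user_id = t2.user_id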
Answer 2: (score: 2)

BigQuery has finally implemented the suggestion from Elliott Brossard, so you can now write:

SELECT user_id, ARRAY_AGG(DISTINCT field1 IGNORE NULLS) AS f1, ARRAY_AGG(DISTINCT field2 IGNORE NULLS) AS f2
FROM t
GROUP BY user_id

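Answer 3: (score: 0)

When ARRAY_AGG() encounters NULL values, specify IGNORE NULLS:

IGNORE NULLS or RESPECT NULLS: If IGNORE NULLS is specified, the NULL values are excluded from the result. If RESPECT NULLS is specified or if neither is specified, the NULL values are included in the result. An error is raised if an array in the final query result contains a NULL element.

From: https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#array_agg

A working example using IGNORE NULLS: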

SELECT
    category, 
    array_agg(fruit IGNORE NULLS) as fruits_array,
FROM (SELECT "A" as category, NULL as fruit UNION ALL
      SELECT "B" as category, "apple" as fruit UNION ALL
      SELECT "B" as category, "pear" as fruit UNION ALL
      SELECT "C" as category, "orange" as fruit)
GROUP BY category
ORDER BY category
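
One detail worth checking in the example above: for a group whose values are all NULL (category A), IGNORE NULLS leaves nothing to aggregate, and as far as I know ARRAY_AGG then produces a NULL array rather than an empty one. If later steps in the query (for example ARRAY_LENGTH) need a real array, a sketch like the following may help. The IFNULL wrapper and the typed empty literal ARRAY<STRING>[] are my additions, not part of the answer.

```sql
SELECT
    category,
    -- Fall back to an explicitly typed empty array when every fruit in the group is NULL.
    IFNULL(ARRAY_AGG(fruit IGNORE NULLS), ARRAY<STRING>[]) AS fruits_array
FROM (SELECT "A" AS category, CAST(NULL AS STRING) AS fruit UNION ALL
      SELECT "B", "apple" UNION ALL
      SELECT "C", "orange")
GROUP BY category
ORDER BY category
```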