Can BigQuery query flattened tables and convert them into a nested data structure?

Time: 2015-02-10 11:06:54

Tags: google-bigquery

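I have run into a problem: there are 5 tables in BigQuery, and one record in table A is related to 5 records in table B. For example: table A holds record a, and table B holds records b, c, d, e, f.

I am currently using this SQL: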

select A.*,B.* from A join each B on A.xx = B.xx

The query returns: a,b; a,c; a,d; a,e; a,f (5 rows).

Is there a way to get the result like this instead:

a,b c d e f; (shown in 1 row)

Any help would be appreciated. Thanks!

2 answers:

Answer 0: (score: 1)

One approach is to use GROUP_CONCAT:

SELECT t1.c1,
       group_concat(t2.c2)
FROM
  (SELECT 'a' AS c1,
          1 AS k) t1
JOIN
  (SELECT *
   FROM
     (SELECT 'b' AS c2,
             1 AS k),
     (SELECT 'c' AS c2,
             1 AS k),
     (SELECT 'd' AS c2,
             1 AS k),
     (SELECT 'e' AS c2,
             1 AS k),
     (SELECT 'f' AS c2,
             1 AS k)) t2 ON t1.k=t2.k
GROUP BY t1.c1

This produces:

+-----+-------+-----------+
| Row | t1_c1 |    f0_    |
+-----+-------+-----------+
|   1 | a     | b,c,d,e,f |
+-----+-------+-----------+
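
The same join-and-aggregate pattern can be tried locally. The following is only a minimal sketch that uses Python's built-in sqlite3 module instead of BigQuery (SQLite also offers a group_concat aggregate). The table names A and B come from the question, while the key column xx holding 1 and the value column val are assumptions made for the example.

```python
import sqlite3

def concat_joined_rows():
    """Join A to B and collapse the matching B values into one row per A value."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: xx is the join key, val is the payload column.
    conn.execute("CREATE TABLE A (xx INTEGER, val TEXT)")
    conn.execute("CREATE TABLE B (xx INTEGER, val TEXT)")
    conn.execute("INSERT INTO A VALUES (1, 'a')")
    conn.executemany(
        "INSERT INTO B VALUES (?, ?)",
        [(1, "b"), (1, "c"), (1, "d"), (1, "e"), (1, "f")],
    )
    rows = conn.execute(
        """
        SELECT A.val, group_concat(B.val)
        FROM A JOIN B ON A.xx = B.xx
        GROUP BY A.val
        """
    ).fetchall()
    conn.close()
    return rows

rows = concat_joined_rows()
print(rows)  # a single row: 'a' plus a comma-separated string of the B values
```

The order of the values inside the concatenated string is not guaranteed by the aggregate, so code that consumes it should not depend on a particular order.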

Another is to use NEST (*a caveat applies, see further below):

SELECT t1.c1,
       nest(t2.c2)
FROM
  (SELECT 'a' AS c1,
          1 AS k) t1
JOIN
  (SELECT *
   FROM
     (SELECT 'b' AS c2,
             1 AS k),
     (SELECT 'c' AS c2,
             1 AS k),
     (SELECT 'd' AS c2,
             1 AS k),
     (SELECT 'e' AS c2,
             1 AS k),
     (SELECT 'f' AS c2,
             1 AS k)) t2 ON t1.k=t2.k
GROUP BY t1.c1

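However, this result has to be written to a table, because BigQuery automatically flattens query results: if you use the NEST function in a top-level query, the results will not contain repeated fields. Use the NEST function in a subselect that produces intermediate results for immediate use by the same query.

By default, the BigQuery web UI flattens all query results. To preserve nested and repeated results, select a destination table, enable "Allow Large Results", and uncheck the "Flatten Results" option.

*However, there is a known bug in this feature when writing to a destination table: see "Save a result set containing repeated field to a destination table". If you are not saving the results to a destination table, you will probably be fine.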
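
To make the difference between nested and flattened results concrete, here is a small sketch in plain Python. It only models the idea and says nothing about how BigQuery implements it; the helper names nest and flatten and the dictionary layout are assumptions made for the illustration.

```python
from collections import defaultdict

def nest(rows):
    """Group (c1, c2) pairs into one record per c1 with a repeated c2 field."""
    grouped = defaultdict(list)
    for c1, c2 in rows:
        grouped[c1].append(c2)
    return [{"c1": c1, "c2": values} for c1, values in grouped.items()]

def flatten(records):
    """Expand every value of the repeated c2 field back into its own row."""
    return [{"c1": r["c1"], "c2": v} for r in records for v in r["c2"]]

# The five rows produced by the join in this answer.
joined = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("a", "f")]

nested = nest(joined)
flat = flatten(nested)

print(nested)     # [{'c1': 'a', 'c2': ['b', 'c', 'd', 'e', 'f']}]
print(len(flat))  # 5
```

Flattening the nested record yields the original five rows again, which is why a top-level query with flattening enabled looks the same as if NEST had not been applied.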

Answer 1: (score: 1)

To get around the limitations in Pentium10's answer, see the workaround in the answer to "BigQuery creat repeated record field from query".

It allows you to:

1. mimic NEST() for multiple fields  
2. save the result directly to a table

and thus addresses two current limitations:

a. The NEST function accepts only one field  
b. NEST is not compatible with unflattened results output