Converting a repeated record into a repeated string array

Time: 2017-12-27 13:03:28

Tags: google-bigquery

I have a table A in which one column is a repeated record, like this:

                +- children: record (repeated)
                |  |- name: string
                |  |- gender: string
                |  |- age: integer

I also have a table B in which one column is a repeated STRING:

                +- names : string (repeated) 

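I'm looking for a way to move the list of names from inside the RECORD in table A into the string array in table B.

Any suggestions would be a great help.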

2 Answers:

Answer 0: (score: 2)

You can use the ARRAY function. Try this:

#standardSQL
SELECT
  ARRAY_TO_STRING(
    ARRAY(SELECT name FROM UNNEST(children)),
    ','  -- ARRAY_TO_STRING requires a delimiter as its second argument
  ) AS names
FROM `dataset.table`

It builds a new array from just the name field of each struct, then converts that array to a string.
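
To try this without touching a real table, here is a minimal sketch using an inline row. The CTE name `sample` and the values in it are made up for illustration; the delimiter `','` is an arbitrary choice.

```sql
#standardSQL
-- Inline dummy row standing in for table A (hypothetical data).
WITH sample AS (
  SELECT [STRUCT('abc1' AS name, 'm' AS gender, 12 AS age),
          STRUCT('xyz1' AS name, 'm' AS gender, 13 AS age)] AS children
)
SELECT
  -- Second argument is the delimiter placed between array elements.
  ARRAY_TO_STRING(ARRAY(SELECT name FROM UNNEST(children)), ',') AS names
FROM sample
```

Note that the result is a single STRING per row, not a repeated STRING, so it only fits if a delimited string is what you need.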

Answer 1: (score: 0)

The following is for BigQuery Standard SQL.

If you want an array, you can use the query below:

#standardSQL
SELECT ARRAY(SELECT name FROM UNNEST(children)) AS names
FROM `yourproject.yourdataset.yourtable`

You can test / play with it using dummy data:

#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
  SELECT [STRUCT<name STRING, gender STRING, age INT64>('abc1','m',12),('xyz1','m',13),('uvw1','f',14)] children UNION ALL
  SELECT [STRUCT<name STRING, gender STRING, age INT64>('abc2','f',12),('xyz2','m',13),('uvw2','f',14)] 
)
SELECT ARRAY(SELECT name FROM UNNEST(children)) AS names
FROM `yourproject.yourdataset.yourtable`

Output:

Row names    
1   abc1     
    xyz1     
    uvw1     
2   abc2     
    xyz2     
    uvw2     
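
Since the question is about getting the names into table B, the same expression can feed an INSERT ... SELECT. This is only a sketch: the table name `yourproject.yourdataset.table_b` is a placeholder, and it assumes table B has a single repeated STRING column called names, as in the schema from the question.

```sql
#standardSQL
-- Assumption: `yourproject.yourdataset.table_b` is a hypothetical name for table B,
-- with one REPEATED STRING column called names.
INSERT INTO `yourproject.yourdataset.table_b` (names)
SELECT
  -- Skip NULL names: BigQuery cannot store NULL elements inside an array.
  ARRAY(SELECT name FROM UNNEST(children) WHERE name IS NOT NULL) AS names
FROM `yourproject.yourdataset.yourtable`
```

Each row of table A becomes one row of table B. If table B has other required columns, they would have to be added to both the column list and the SELECT.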

If you want a string instead:

#standardSQL
SELECT (SELECT STRING_AGG(name) FROM UNNEST(children)) AS names
FROM `yourproject.yourdataset.yourtable`   

You can test / play with it using the same dummy data:

#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
  SELECT [STRUCT<name STRING, gender STRING, age INT64>('abc1','m',12),('xyz1','m',13),('uvw1','f',14)] children UNION ALL
  SELECT [STRUCT<name STRING, gender STRING, age INT64>('abc2','f',12),('xyz2','m',13),('uvw2','f',14)] 
)
SELECT (SELECT STRING_AGG(name) FROM UNNEST(children)) AS names
FROM `yourproject.yourdataset.yourtable`   

The output is now:

Row names    
1   abc1,xyz1,uvw1   
2   abc2,xyz2,uvw2
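
STRING_AGG uses a comma by default and does not promise any particular order unless you ask for one. If the delimiter or the element order matters, a variant along these lines may help. The aliases `c` and `pos` and the `' | '` delimiter are illustrative choices, not part of the original answer.

```sql
#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
  SELECT [STRUCT<name STRING, gender STRING, age INT64>('abc1','m',12),('xyz1','m',13),('uvw1','f',14)] children
)
SELECT
  -- WITH OFFSET exposes each element's position in the array,
  -- so ORDER BY pos keeps the names in their original array order.
  (SELECT STRING_AGG(c.name, ' | ' ORDER BY pos)
   FROM UNNEST(children) AS c WITH OFFSET AS pos) AS names
FROM `yourproject.yourdataset.yourtable`
```

With the dummy row above this should give abc1 | xyz1 | uvw1.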