What is the best way to parse a BigQuery array into columns?

Asked: 2019-12-17 15:34:47

Tags: arrays struct google-bigquery

I have a table like this:

select 'Alice' as Name, ['a=1','b=2','c=3'];

and I would like to convert it to:

select 'Alice' as Name, 1 as a, 2 as b, 3 as c

What is the best way to do this?

I was thinking of perhaps converting it to an array of structs first:

select 'Alice' as Name, [struct('a' as Letter, 1 as Number),struct('b' as Letter, 2 as Number) ,struct('c' as Letter, 3 as Number)]  as struct_column

1 Answer:

Answer 0 (score: 3)

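Assuming you do not know the names and the number of the "future" columns in advance, I would recommend flattening the array into key/value rows instead, as in the example below (BigQuery Standard SQL):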

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Alice' AS Name, ['a=1','b=2','c=3'] attributes UNION ALL
  SELECT 'Cheshire Cat', ['a=4', 'x=5'] UNION ALL
  SELECT 'White Rabbit', ['a=6', 'c=7'] 
)
SELECT Name, 
  SPLIT(kv, '=')[OFFSET(0)] key, 
  SPLIT(kv, '=')[SAFE_OFFSET(1)] value  
FROM `project.dataset.table`, UNNEST(attributes) kv   

with the result:

Row Name            key value    
1   Alice           a   1    
2   Alice           b   2    
3   Alice           c   3    
4   Cheshire Cat    a   4    
5   Cheshire Cat    x   5    
6   White Rabbit    a   6    
7   White Rabbit    c   7     
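
Note that SPLIT returns strings, so value in the query above is a STRING, while the question asks for numbers. A minimal sketch of the same flattening query with the value converted, assuming every value is meant to be an integer (SAFE_CAST yields NULL instead of raising an error when a value does not parse):

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Alice' AS Name, ['a=1','b=2','c=3'] attributes UNION ALL
  SELECT 'Cheshire Cat', ['a=4', 'x=5'] UNION ALL
  SELECT 'White Rabbit', ['a=6', 'c=7']
)
SELECT Name,
  -- Text before the first '=' is the attribute name
  SPLIT(kv, '=')[OFFSET(0)] AS key,
  -- Assumption: values are integers; a missing '=' or a non-numeric value becomes NULL
  SAFE_CAST(SPLIT(kv, '=')[SAFE_OFFSET(1)] AS INT64) AS value
FROM `project.dataset.table`, UNNEST(attributes) kv
```

If the values can be of mixed types, keeping them as STRING and casting later is probably the safer choice.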
  

I do know the attributes...

In that case, you can pivot the flattened rows with conditional aggregation:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Alice' AS Name, ['a=1','b=2','c=3'] attributes UNION ALL
  SELECT 'Cheshire Cat', ['a=4', 'b=5'] UNION ALL
  SELECT 'White Rabbit', ['a=6', 'c=7'] 
)
SELECT Name,
  MAX(IF(key = 'a', value, NULL)) a,
  MAX(IF(key = 'b', value, NULL)) b,
  MAX(IF(key = 'c', value, NULL)) c
FROM (
  SELECT Name, 
    SPLIT(kv, '=')[OFFSET(0)] key, 
    SPLIT(kv, '=')[SAFE_OFFSET(1)] value  
  FROM `project.dataset.table`, UNNEST(attributes) kv   
)
GROUP BY Name   

with the result:

Row Name            a       b       c    
1   Alice           1       2       3    
2   Cheshire Cat    4       5       null     
3   White Rabbit    6       null    7
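
BigQuery also has a PIVOT operator that can express the same conditional aggregation more compactly. A sketch, again assuming the attribute names a, b and c are known up front; the list after IN has to consist of constants, so this does not help when the keys are unknown:

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Alice' AS Name, ['a=1','b=2','c=3'] attributes UNION ALL
  SELECT 'Cheshire Cat', ['a=4', 'b=5'] UNION ALL
  SELECT 'White Rabbit', ['a=6', 'c=7']
)
SELECT *
FROM (
  -- Flatten each 'key=value' string into one row per attribute
  SELECT Name,
    SPLIT(kv, '=')[OFFSET(0)] AS key,
    SPLIT(kv, '=')[SAFE_OFFSET(1)] AS value
  FROM `project.dataset.table`, UNNEST(attributes) kv
)
-- One output column per listed key; rows are grouped by the remaining column (Name)
PIVOT (MAX(value) FOR key IN ('a', 'b', 'c'))
```

This should return one row per Name with columns a, b and c, with NULL where an attribute is absent, matching the result of the GROUP BY version.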