Converting an ARRAY<STRUCT> into multiple columns in BigQuery SQL

Time: 2018-08-10 18:13:38

Tags: sql google-bigquery

I am trying to convert an array into multiple columns. The data structure looks like this:

column name: Parameter
[
  -{
      key: "Publisher_name"
      value: "Rubicon"
   }
  -{
      key: "device_type"
      value: "IDFA"
   }
  -{
      key: "device_id"
      value: "AAAA-BBBB-CCCC-DDDD"
   }
] 

What I want to get:

publisher_name  device_type  device_id
Rubicon         IDFA         AAAA-BBBB-CCCC-DDDD

I have already tried the following, but it duplicates the other columns, producing one row per array element:

select h from table, unnest(parameter) as h

By the way, I am curious why we would use this kind of structure in BigQuery. Couldn't we just add the three columns above to the table?

2 answers:

Answer 0: (score: 1)

To convert to multiple columns, you need to aggregate, like this:

select ?,
       max(case when pv.key = 'Publisher_name' then pv.value end) as Publisher_name,
       max(case when pv.key = 'device_type' then pv.value end) as device_type,
       max(case when pv.key = 'device_id' then pv.value end) as device_id
from t cross join
     unnest(t.parameter) pv
group by ?

You need to list the new columns you want explicitly. The ? stands for the columns that should stay unchanged.
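
As a concrete sketch of the aggregation above, the query below fills in the ? with a hypothetical id column and uses inline sample data in place of a real table. The names t and id and the sample row are assumptions for illustration only; the struct fields are taken to be key and value, as in the question.

```sql
#standardSQL
-- Hypothetical sample data standing in for the real table.
-- `id` is an assumed column that plays the role of the "?" placeholder.
WITH t AS (
  SELECT 1 AS id,
         [STRUCT('Publisher_name' AS key, 'Rubicon' AS value),
          STRUCT('device_type' AS key, 'IDFA' AS value),
          STRUCT('device_id' AS key, 'AAAA-BBBB-CCCC-DDDD' AS value)] AS parameter
)
SELECT t.id,
       -- Each CASE keeps the value only for the matching key; MAX collapses
       -- the unnested rows back into one row per id.
       MAX(CASE WHEN pv.key = 'Publisher_name' THEN pv.value END) AS Publisher_name,
       MAX(CASE WHEN pv.key = 'device_type' THEN pv.value END) AS device_type,
       MAX(CASE WHEN pv.key = 'device_id' THEN pv.value END) AS device_id
FROM t CROSS JOIN
     UNNEST(t.parameter) AS pv
GROUP BY t.id
```

Note that a cross join drops rows whose array is empty or NULL; if those rows should be kept, LEFT JOIN UNNEST(t.parameter) AS pv is one alternative.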

Answer 1: (score: 1)

The following is for BigQuery Standard SQL:

#standardSQL
SELECT 
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'Publisher_name') AS Publisher_name,
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'device_type') AS device_type,
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'device_id') AS device_id
FROM `project.dataset.table`
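
Because this form needs no GROUP BY, the remaining columns of the table can be carried along directly. A minimal sketch, assuming inline sample data and two hypothetical extra columns (id and country) that are not part of the original question:

```sql
#standardSQL
-- Hypothetical sample row; `id` and `country` are assumed extra columns.
WITH sample AS (
  SELECT 1 AS id,
         'US' AS country,
         [STRUCT('Publisher_name' AS key, 'Rubicon' AS value),
          STRUCT('device_type' AS key, 'IDFA' AS value),
          STRUCT('device_id' AS key, 'AAAA-BBBB-CCCC-DDDD' AS value)] AS Parameter
)
SELECT
  * EXCEPT(Parameter),  -- keep every other column, drop the raw array
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'Publisher_name') AS Publisher_name,
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'device_type') AS device_type,
  (SELECT value FROM UNNEST(Parameter) WHERE key = 'device_id') AS device_id
FROM sample
```

Each row of the source table yields exactly one output row, so the other columns are not duplicated.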

You can refactor the code further with a SQL UDF, as shown below:

#standardSQL
CREATE TEMP FUNCTION getValue(k STRING, arr ANY TYPE) AS
((SELECT value FROM UNNEST(arr) WHERE key = k));
SELECT 
  getValue('Publisher_name', Parameter) AS Publisher_name,
  getValue('device_type', Parameter) AS device_type,
  getValue('device_id', Parameter) AS device_id
FROM `project.dataset.table`
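
One caveat with the scalar subquery: it raises an error if the same key appears more than once in the array, because a scalar subquery must return at most one row. If duplicate keys are possible in your data, a variant along these lines picks the first match by array position. The function name getFirstValue and the sample data are hypothetical, and whether "first wins" is the right rule depends on your data.

```sql
#standardSQL
-- Hypothetical variant of getValue that tolerates duplicate keys by
-- taking the first matching element in array order.
CREATE TEMP FUNCTION getFirstValue(k STRING, arr ANY TYPE) AS
((SELECT value
  FROM UNNEST(arr) WITH OFFSET AS pos
  WHERE key = k
  ORDER BY pos
  LIMIT 1));

-- Assumed sample data with a repeated key, for illustration only.
WITH sample AS (
  SELECT [STRUCT('device_type' AS key, 'IDFA' AS value),
          STRUCT('device_type' AS key, 'GAID' AS value)] AS Parameter
)
SELECT
  getFirstValue('device_type', Parameter) AS device_type
FROM sample
```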