Google BigQuery: joining tables with REPEATED RECORD values

Time: 2017-11-13 19:10:45

Tags: arrays join google-bigquery standard-sql

I have a table like this:

Row field1  field2  field3.a    field3.b
1   value1  1       id1         key5
                    id2         key6
2   value2  2       id3         key7
                    id4         key8

The first two columns are standard fields; the third is a REPEATED RECORD field.

I also have 2 other tables:

Row  id     value
1    id1    my-valueA
2    id2    my-valueB
3    id3    my-valueC
4    id4    my-valueD

Row  id     value
1    key5   my-valueE
2    key6   my-valueF
3    key7   my-valueG
4    key8   my-valueH

These map the id / key values from the first table to a given value (I use the terms id and key to avoid confusion, but in the end the concept is the same).

Here is a complete statement that reproduces the structure:

#standardSQL
WITH my_table AS (
  SELECT "value1" as field1, 1 as field2, [STRUCT("id1" as a, "key5" as b),STRUCT("id2" as a, "key6" as b)] as field3
  UNION ALL
  SELECT "value2" as field1, 2 as field2, [STRUCT("id3" as a, "key7" as b),STRUCT("id4" as a, "key8" as b)] as field3
),

ids_table AS (
  SELECT "id1" as id, "my-valueA" as value
  UNION ALL
  SELECT "id2" as id, "my-valueB" as value
  UNION ALL
  SELECT "id3" as id, "my-valueC" as value
  UNION ALL
  SELECT "id4" as id, "my-valueD" as value
),

keys_table AS (
  SELECT "key5" as id, "my-valueE" as value
  UNION ALL
  SELECT "key6" as id, "my-valueF" as value
  UNION ALL
  SELECT "key7" as id, "my-valueG" as value
  UNION ALL
  SELECT "key8" as id, "my-valueH" as value
)

-- SELECT * FROM my_table
-- SELECT * FROM ids_table
-- SELECT * FROM keys_table

My goal is to replace the key / id values in the first table with the values given by the other 2 tables, as in a classic join on id.

This is the expected output:

Row field1  field2  t2_value    t3_value
1   value1  1       my-valueA   my-valueE
                    my-valueB   my-valueF
2   value2  2       my-valueC   my-valueG
                    my-valueD   my-valueH

At first I thought of using the UNNEST operator to flatten the rows, so that a simple JOIN can resolve the values, and then re-aggregating the array with the replaced values.

With UNNEST and the joins, the ids are correctly replaced by their values, but I can no longer reproduce the original REPEATED RECORD structure.
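For illustration, the flattening step described above might look like the following sketch. It is not the statement from the original post. The aliases t1, t2, t3 and the output column names t2_value and t3_value are assumptions taken from the expected output, and the sample data is repeated so the query runs on its own.

```sql
#standardSQL
WITH my_table AS (
  SELECT "value1" AS field1, 1 AS field2,
    [STRUCT("id1" AS a, "key5" AS b), STRUCT("id2" AS a, "key6" AS b)] AS field3
  UNION ALL
  SELECT "value2" AS field1, 2 AS field2,
    [STRUCT("id3" AS a, "key7" AS b), STRUCT("id4" AS a, "key8" AS b)] AS field3
), ids_table AS (
  SELECT "id1" AS id, "my-valueA" AS value UNION ALL
  SELECT "id2" AS id, "my-valueB" AS value UNION ALL
  SELECT "id3" AS id, "my-valueC" AS value UNION ALL
  SELECT "id4" AS id, "my-valueD" AS value
), keys_table AS (
  SELECT "key5" AS id, "my-valueE" AS value UNION ALL
  SELECT "key6" AS id, "my-valueF" AS value UNION ALL
  SELECT "key7" AS id, "my-valueG" AS value UNION ALL
  SELECT "key8" AS id, "my-valueH" AS value
)
SELECT
  field1, field2,
  t2.value AS t2_value,  -- value looked up for field3.a (assumed column name)
  t3.value AS t3_value   -- value looked up for field3.b (assumed column name)
FROM my_table
-- One output row per element of the repeated field
CROSS JOIN UNNEST(field3) AS t1
LEFT JOIN ids_table AS t2 ON t1.a = t2.id
LEFT JOIN keys_table AS t3 ON t1.b = t3.id
```

A query of this shape returns one flat row per array element (four rows for the sample data), which is why the REPEATED RECORD structure is lost at this point.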

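1 answer:

Answer 0 (score: 2)

The following is for BigQuery Standard SQL. It rebuilds the array row by row with a correlated subquery: field3 is unnested, each element is joined to the two lookup tables, and the results are collected again with ARRAY_AGG.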

  
#standardSQL
WITH my_table AS (
  SELECT "value1" AS field1, 1 AS field2, [STRUCT("id1" AS a, "key5" AS b),STRUCT("id2" AS a, "key6" AS b)] AS field3 UNION ALL
  SELECT "value2" AS field1, 2 AS field2, [STRUCT("id3" AS a, "key7" AS b),STRUCT("id4" AS a, "key8" AS b)] AS field3
), ids_table AS (
  SELECT "id1" AS id, "my-valueA" AS value UNION ALL
  SELECT "id2" AS id, "my-valueB" AS value UNION ALL
  SELECT "id3" AS id, "my-valueC" AS value UNION ALL
  SELECT "id4" AS id, "my-valueD" AS value
),keys_table AS (
  SELECT "key5" AS id, "my-valueE" AS value UNION ALL
  SELECT "key6" AS id, "my-valueF" AS value UNION ALL
  SELECT "key7" AS id, "my-valueG" AS value UNION ALL
  SELECT "key8" AS id, "my-valueH" AS value
)
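SELECT
  field1, field2,
  (
    -- Correlated subquery: runs once per row of my_table
    SELECT ARRAY_AGG(STRUCT<a_value STRING, b_value STRING>(t2.value, t3.value))
    FROM UNNEST(field3) t1
      -- LEFT JOIN keeps elements whose id or key has no match (value is NULL)
      LEFT JOIN ids_table AS t2 ON t1.a = t2.id
      LEFT JOIN keys_table AS t3 ON t1.b = t3.id
  ) AS field3
FROM my_table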

This produces the following output:

field1  field2  field3.a_value  field3.b_value   
value1  1       my-valueA       my-valueE    
                my-valueB       my-valueF    
value2  2       my-valueC       my-valueG    
                my-valueD       my-valueH
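A possible variant, offered as a sketch rather than as part of the answer above: ARRAY_AGG without ORDER BY does not promise any element order, and it returns NULL when there are no input rows. If the original element order matters, the array position can be carried along with WITH OFFSET and used for sorting inside an ARRAY subquery. The alias pos is an assumption introduced here for illustration.

```sql
#standardSQL
WITH my_table AS (
  SELECT "value1" AS field1, 1 AS field2,
    [STRUCT("id1" AS a, "key5" AS b), STRUCT("id2" AS a, "key6" AS b)] AS field3
  UNION ALL
  SELECT "value2" AS field1, 2 AS field2,
    [STRUCT("id3" AS a, "key7" AS b), STRUCT("id4" AS a, "key8" AS b)] AS field3
), ids_table AS (
  SELECT "id1" AS id, "my-valueA" AS value UNION ALL
  SELECT "id2" AS id, "my-valueB" AS value UNION ALL
  SELECT "id3" AS id, "my-valueC" AS value UNION ALL
  SELECT "id4" AS id, "my-valueD" AS value
), keys_table AS (
  SELECT "key5" AS id, "my-valueE" AS value UNION ALL
  SELECT "key6" AS id, "my-valueF" AS value UNION ALL
  SELECT "key7" AS id, "my-valueG" AS value UNION ALL
  SELECT "key8" AS id, "my-valueH" AS value
)
SELECT
  field1, field2,
  ARRAY(
    -- SELECT AS STRUCT turns each joined row into one array element
    SELECT AS STRUCT t2.value AS a_value, t3.value AS b_value
    -- pos (assumed alias) is the zero-based position of the element in field3
    FROM UNNEST(field3) AS t1 WITH OFFSET AS pos
    LEFT JOIN ids_table AS t2 ON t1.a = t2.id
    LEFT JOIN keys_table AS t3 ON t1.b = t3.id
    ORDER BY pos  -- keep the elements in their original order
  ) AS field3
FROM my_table
```

With this form, an empty field3 yields an empty array instead of NULL, which may or may not be what is wanted.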