How do I flatten nested JSON key/value pairs into a single array of values?

Time: 2019-05-27 18:16:20

Tags: sql snowflake-datawarehouse

In Snowflake, my data is structured as follows:


ORGANIZATION TABLE
------------------
Org:variant
------------------
{
    relationships: [{
        { name: 'mother', value: a },
        { name: 'siblings', value: [ 'c', 'd' ] }
    }]
}

PEOPLE TABLE
-------------------
Person:variant
-------------------
{
    id: a
    name: Mary
}
-------------------
{
    id: b
    name: Joe
}
-------------------
{
    id: c
    name: John
}

I want a result like this:

ORGANIZATION                                       | PEOPLE
---------------------------------------------------|----------------------------
{                                                  |[
    relationships: [{                              |  {
        { name: 'mother', value: a },              |    id: a,
        { name: 'siblings', value: [ 'c', 'd' ] }  |    name: Mary
    }]                                             |  },
}                                                  |  {
                                                   |    id: b,
                                                   |    name: Joe
                                                   |  },
                                                   |  {
                                                   |    id: c,
                                                   |    name: John
                                                   |  }
                                                   |]

I'm sure ARRAY_AGG is involved somehow, but I can't figure out how to aggregate the results into a single array of values.

My current query:

SELECT Org, ARRAY_AGG(Person) as People
FROM Organizations
INNER JOIN People ON People.id IN Org.relationships...?? (I'm lost here)
GROUP BY Org

1 Answer:

Answer 0: (score: 3)

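The following query shows how to use FLATTEN and ARRAY_AGG to get the desired output.

  • FLATTEN un-nests each array so that you can join on the values inside it.
  • ARRAY_AGG aggregates the matching people, grouped by organization.
  • The CASE expression handles the fact that a relationship's value is not always an array: it can be a single id ("mother") or an array of ids ("siblings").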
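-- Sample data: one organization whose relationships hold either a single id or an array of ids.
CREATE OR REPLACE TABLE organizations (org variant) AS
SELECT parse_json('{relationships: [{ name: "mother", value: "a" }, { name: "siblings", value: [ "b", "c" ] } ] } ');

-- Sample data: one person per row.
CREATE OR REPLACE TABLE people (person variant) AS
SELECT parse_json($1)
FROM
VALUES ('{id:"a", name: "Mary"}'),
       ('{id:"b", name: "Joe"}'),
       ('{id:"c", name: "John"}');

-- org_people: one row per (organization, related person id).
WITH org_people AS
  (SELECT o.org,
          relationship.value AS relationship,
          -- Array value: take the flattened element; scalar value: use it directly.
          CASE is_array(relationship:value)
              WHEN TRUE THEN person_in_relationship.value
              ELSE relationship:value
          END AS person_in_relationship
   FROM organizations o,
        -- One row per entry in org:relationships.
        LATERAL FLATTEN(o.org:relationships) relationship,
        -- One row per id in the entry's value; OUTER => TRUE keeps entries whose value is not an array.
        LATERAL FLATTEN(relationship.value:value, OUTER=>TRUE) person_in_relationship
  )
-- Join each id to its person record and collect the records per organization.
SELECT op.org,
       ARRAY_AGG(p.person) AS people
FROM org_people op
JOIN people p ON p.person:id = op.person_in_relationship
GROUP BY op.org;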
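
To make the row-by-row behaviour of the query easier to follow, here is a minimal sketch in plain Python (not Snowflake) that models the same steps on the answer's sample data. The helper names `flatten_outer`, `org_people`, and `people_per_org` are hypothetical and exist only for this illustration, and `flatten_outer` only approximates FLATTEN with OUTER => TRUE for array and scalar inputs (it ignores objects).

```python
import json

# Sample data mirroring the answer's tables (one VARIANT value per row).
organizations = [
    {"relationships": [
        {"name": "mother", "value": "a"},
        {"name": "siblings", "value": ["b", "c"]},
    ]},
]
people = [
    {"id": "a", "name": "Mary"},
    {"id": "b", "name": "Joe"},
    {"id": "c", "name": "John"},
]


def flatten_outer(value):
    """Rough model of FLATTEN(..., OUTER => TRUE) for arrays and scalars:
    one row per array element, or a single None row when the input is
    not an array (or is an empty array)."""
    if isinstance(value, list) and value:
        return list(value)
    return [None]


def org_people(orgs):
    """Model of the org_people CTE: one (org, person id) row per related person."""
    rows = []
    for org in orgs:
        for relationship in org["relationships"]:   # first FLATTEN
            raw = relationship["value"]
            for element in flatten_outer(raw):      # second FLATTEN
                # CASE: use the array element, else the scalar value itself.
                person_id = element if isinstance(raw, list) else raw
                rows.append((org, person_id))
    return rows


def people_per_org(orgs, persons):
    """Model of the final JOIN + GROUP BY + ARRAY_AGG."""
    by_id = {p["id"]: p for p in persons}
    grouped = {}
    for org, person_id in org_people(orgs):
        if person_id in by_id:                      # inner join on id
            key = json.dumps(org, sort_keys=True)   # group by org
            grouped.setdefault(key, (org, []))[1].append(by_id[person_id])
    return list(grouped.values())


result = people_per_org(organizations, people)
for org, matched in result:
    print([p["name"] for p in matched])
```

In this sketch the people come out in input order. In Snowflake, ARRAY_AGG does not guarantee the order of the elements unless you add WITHIN GROUP (ORDER BY ...).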