From flat to multi-level nested data

Time: 2018-06-08 17:20:34

Tags: sql postgresql

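In BigQuery (BQ), I use ARRAY_AGG(STRUCT(... to restructure some flat data, but I would like to go one step further: create another array of records inside the array of records. Although STRUCT does not exist in PostgreSQL, I am also interested in how this could be solved there.

Consider the flat data: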

WITH a AS (
SELECT 'ABC' company, 'adress1' address, 'name1' name, 'email1' email, 'work' ph_type, '+123' ph_nr
UNION ALL
SELECT 'ABC' company, 'adress1' address, 'name1' name, 'email1' email, 'cell' ph_type, '+987'
UNION ALL
SELECT 'DEF' company, 'adress2' address, 'name2' name, 'email2' email, 'work' ph_type, '+127'
UNION ALL
SELECT 'DEF' company, 'adress2' address, 'name2' name, 'email2' email, 'cell' ph_type, '+988'
UNION ALL
SELECT 'XYZ' company, 'adress3' address, 'name3' name, 'email3' email, 'work' ph_type, '+456'
)

I can nest contact like this:

SELECT company, address, ARRAY_AGG(STRUCT(name, email, ph_type, ph_nr)) contact
FROM a
GROUP BY company, address
ORDER BY 1

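But how can I, in the same SELECT statement, also nest phones (an array of records inside contact)?

For the first contact, the JSON representation would look like: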

[
 {
  "company": "ABC",
  "address": "adress1",
  "contact": [
    {
      "name": "name1",
      "email": "email1",
      "phone": [
        {
         "ph_type": "work",
         "ph_nr": "+123"
        },
        {
         "ph_type": "cell",
         "ph_nr": "+987"
        }
      ]
    },
   ...

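This could be done by aggregating in sequence with a WITH clause or a subselect, but I am not sure it would perform well (is the data read twice?).

I have 600M records to parse every day, so I would like to know the most efficient way.

Edit: corrected the name definition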

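1 answer:

Answer 0: (score: 2)

The answer to your question is two levels of aggregation.

However, the question itself confuses me, because the query uses name, but it is not defined in the data.

Here is an example of how to do it: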

SELECT company, address, ARRAY_AGG(STRUCT(email, phones)) as contact
FROM (SELECT company, name, address, email, ARRAY_AGG(STRUCT(ph_type, ph_nr)) as phones
      FROM a
      GROUP BY company, name, address, email
     ) a
GROUP BY company, address
ORDER BY 1
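
Since the question also asks about PostgreSQL, where STRUCT does not exist, the same two-level aggregation can be sketched there with JSON functions. This is only an illustration, assuming PostgreSQL 9.5 or later (for jsonb_agg and jsonb_build_object). The CTE name per_contact and the choice of jsonb over json are assumptions made for this example and do not come from the original post. The name column is included here because the question's sample data now defines it.

```sql
-- Sketch for PostgreSQL 9.5+ (assumed version).
-- per_contact is a hypothetical CTE name chosen for illustration.
WITH a (company, address, name, email, ph_type, ph_nr) AS (
  VALUES
    ('ABC', 'adress1', 'name1', 'email1', 'work', '+123'),
    ('ABC', 'adress1', 'name1', 'email1', 'cell', '+987'),
    ('DEF', 'adress2', 'name2', 'email2', 'work', '+127'),
    ('DEF', 'adress2', 'name2', 'email2', 'cell', '+988'),
    ('XYZ', 'adress3', 'name3', 'email3', 'work', '+456')
),
-- Level 1: one row per contact, with its phones collapsed into a JSON array.
per_contact AS (
  SELECT company, address, name, email,
         jsonb_agg(jsonb_build_object('ph_type', ph_type, 'ph_nr', ph_nr)) AS phone
  FROM a
  GROUP BY company, address, name, email
)
-- Level 2: one row per company, with its contacts (each carrying a phone array)
-- collapsed into a second JSON array.
SELECT company, address,
       jsonb_agg(jsonb_build_object('name', name, 'email', email, 'phone', phone)) AS contact
FROM per_contact
GROUP BY company, address
ORDER BY company;
```

In BigQuery, the corresponding change to the query above would be to add name to the outer STRUCT, i.e. ARRAY_AGG(STRUCT(name, email, phones)), so that each contact record carries both the name and its phone array.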