Snowflake: create a JSON structure from table data

Time: 2021-04-12 19:04:42

Tags: json snowflake-cloud-data-platform

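I have a view that returns data in the format below. I want to UPSERT it into a table and, in addition, load the row values into a new column as a JSON structure. The JSON structure needs to be built by grouping on the ID and NAME columns, with the remaining columns forming arrays inside the JSON.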

Sample source data: [image not available]

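I want to load the values into a target table similar to the above. In addition, a JSON-structured value needs to be formed by grouping on the ID and NAME columns and loaded into a JSON column of type VARIANT.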

Sample target table: [image not available]

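The sample JSON column values that need to be populated for JSON1, JSON2 and JSON3 are shown below. Any help in achieving the expected result is much appreciated. Thanks.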

JSON1

[
    {
        "col1": "1",
        "col2": "TEST001",
        "war_detail":[
                        {   "War_id": "WR001",
                            "War_start_date": "1/1/1970",
                            "War_end_date": "12/12/9999"
                        }
                    ],
        "Con_details": [
                            {   "Cont_id": "CON001",
                                "Con_start_date": "1/1/1970",
                                "Con_end_date": "12/12/9999"
                            },
                            {   "Cont_id": "CON002",
                                "Con_start_date": "1/1/2000",
                                "Con_end_date": "12/12/9999"
                            }
                        ]
    },
    {
        "col1": "9",
        "col2": "TEST001",
        "war_detail":[
                        {   "War_id": "WR001",
                            "War_start_date": "1/1/1970",
                            "War_end_date": "12/12/9999"
                        }
                    ],
        "Con_details": [
                            {   "Cont_id": "CON123",
                                "Con_start_date": "1/1/2010",
                                "Con_end_date": "12/12/9999"
                            }
                        ]
    }
]

---------------------------------------
JSON2

[
    {
            "col1": "2",
            "col2": "TEST002",
            "war_detail":[
                            {   "War_id": "WR987",
                                "War_start_date": "1/1/1970",
                                "War_end_date": "12/12/9999"
                            },
                            {   "War_id": "WR123",
                                "War_start_date": "1/1/1990",
                                "War_end_date": "12/12/9999"
                            }
                        ],
            "Con_details": [
                                {   "Cont_id": "CON003",
                                    "Con_start_date": "1/1/2020",
                                    "Con_end_date": "12/12/9999"
                                }
                            ]
        }
]
---------------------------------------
JSON3

[
    {
            "col1": "2",
            "col2": "TEST002",
            "war_detail":[
                            {   "War_id": "WR678",
                                "War_start_date": "1/1/2001",
                                "War_end_date": "12/12/2023"
                            },
                            {   "War_id": "WR004",
                                "War_start_date": "1/1/2010",
                                "War_end_date": "12/12/2030"
                            }
                        ],
            "Con_details": []
        }
]

1 Answer:

Answer 0: (score: 0)

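First, pictures are helpful, but a text block containing the actual values is much better, because the values can be copied into SQL instead of being typed in by hand.

Anyway, using a CTE for the data: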

WITH data AS (
    SELECT * FROM VALUES
        (1, 'test001', 'WR001', '1970-01-01', '9999-12-12', 'CON001', '1970-01-01', '9999-12/12'),
        (1, 'test001', 'WR001', '1970-01-01', '9999-12-12', 'CON002', '1970-01-01', '9999-12/12'),
        (9, 'test001', 'WR001', '1970-01-01', '9999-12-12', 'CON123', '1970-01-01', '9999-12/12'),
        (2, 'test002', 'WR987', '2020-01-01', '9999-12-12', 'CON003', '1970-01-01', '9999-12/12'),
        (2, 'test002', 'WR123', '1990-01-01', '9999-12-12', 'CON003', '1970-01-01', '9999-12/12'),
        (3, 'test003', 'WR678', '2001-01-01', '2023-12-12', null, null, null),
        (3, 'test003', 'WR004', '2010-01-01', '2030-12-12', null, null, null)
        v(id, name, war_id, war_start_date, war_end_date, cont_id, con_start_date, con_end_date)
)

Assuming your JSON1, JSON2 and JSON3 are in effect one value per NAME, we can nest some OBJECT_CONSTRUCT and ARRAY_AGG calls to build the data as expected:

SELECT 
    array_agg(war_block) WITHIN GROUP (ORDER BY war_block:col1) as json_block
FROM (
    SELECT name,
        object_construct('col1', id, 'col2', name, 'war_detail', a_war, 'con_details', a_con) as war_block
    FROM (
        SELECT id
            ,name
            ,array_agg(distinct war) WITHIN GROUP (ORDER BY war:war_start_date) AS a_war
            ,array_agg(distinct con) WITHIN GROUP (ORDER BY con:con_start_date) AS a_con
        FROM (
            SELECT id
                ,name
                ,object_construct('war_id', war_id, 'war_start_date', war_start_date, 'war_end_date', war_end_date) as war
                ,object_construct('cont_id', cont_id, 'con_start_date', con_start_date, 'con_end_date', con_end_date) as con
            FROM data
        )
        GROUP BY id, name
    )
)
GROUP BY name
ORDER BY name;

This gives:

JSON_BLOCK
[
  {
    "col1": 1,
    "col2": "test001",
    "con_details": [
      { "con_end_date": "9999-12/12", "con_start_date": "1970-01-01", "cont_id": "CON001" },
      { "con_end_date": "9999-12/12", "con_start_date": "1970-01-01", "cont_id": "CON002" }
    ],
    "war_detail": [
      { "war_end_date": "9999-12-12", "war_id": "WR001", "war_start_date": "1970-01-01" }
    ]
  },
  {
    "col1": 9,
    "col2": "test001",
    "con_details": [
      { "con_end_date": "9999-12/12", "con_start_date": "1970-01-01", "cont_id": "CON123" }
    ],
    "war_detail": [
      { "war_end_date": "9999-12-12", "war_id": "WR001", "war_start_date": "1970-01-01" }
    ]
  }
]

[
  {
    "col1": 2,
    "col2": "test002",
    "con_details": [
      { "con_end_date": "9999-12/12", "con_start_date": "1970-01-01", "cont_id": "CON003" }
    ],
    "war_detail": [
      { "war_end_date": "9999-12-12", "war_id": "WR123", "war_start_date": "1990-01-01" },
      { "war_end_date": "9999-12-12", "war_id": "WR987", "war_start_date": "2020-01-01" }
    ]
  }
]

[
  {
    "col1": 3,
    "col2": "test003",
    "con_details": [
      {}
    ],
    "war_detail": [
      { "war_end_date": "2023-12-12", "war_id": "WR678", "war_start_date": "2001-01-01" },
      { "war_end_date": "2030-12-12", "war_id": "WR004", "war_start_date": "2010-01-01" }
    ]
  }
]
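
In the last row above, con_details contains a single empty object rather than the empty array shown in the question's JSON3. This is because OBJECT_CONSTRUCT omits any key whose value is NULL, so a row with no con data still produces an empty object. A minimal sketch of one way around it, assuming that a NULL cont_id means "no con record" (an assumption; the question does not say so): return NULL instead of an object, since ARRAY_AGG skips NULL inputs. The CTE here is a cut-down stand-in for the full data set, kept only to make the example self-contained.

```sql
-- Sketch only: a reduced version of the data CTE with just the con columns.
WITH data AS (
    SELECT * FROM VALUES
        (2, 'test002', 'CON003', '1970-01-01', '9999-12-12'),
        (3, 'test003', NULL, NULL, NULL)
        v(id, name, cont_id, con_start_date, con_end_date)
)
SELECT id
    ,name
    ,array_agg(distinct con) AS a_con
FROM (
    SELECT id
        ,name
        -- Assumption: a NULL cont_id means this row has no con record,
        -- so emit NULL (skipped by ARRAY_AGG) instead of an empty object.
        ,IFF(cont_id IS NULL,
             NULL,
             object_construct('cont_id', cont_id, 'con_start_date', con_start_date, 'con_end_date', con_end_date)) AS con
    FROM data
)
GROUP BY id, name
ORDER BY id;
```

With this change, id 3 should come back with an empty a_con array. In the full query, the same IFF would wrap the con expression in the innermost SELECT; the war expression could be treated the same way if war_id can be NULL.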

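Note that the attribute order differs from your samples, but object attributes have no defined order anyway. The arrays, on the other hand, are sorted in the order I assume you want.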
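
The question also asks for the result to be upserted into a table, which the query above does not cover. A rough sketch using Snowflake's MERGE follows. The target table target_json, its columns, and the choice of name as the merge key are all hypothetical, since the real target layout was only shown in the missing image. The USING subquery is a shortened stand-in for the nested aggregation query.

```sql
-- Hypothetical target table: the name and columns are assumptions.
CREATE TABLE IF NOT EXISTS target_json (
    name       VARCHAR,
    json_block VARIANT
);

MERGE INTO target_json AS t
USING (
    -- Stand-in source: replace with the full nested aggregation query.
    SELECT name
        ,TO_VARIANT(
            array_agg(object_construct('col1', id, 'col2', name))
                WITHIN GROUP (ORDER BY id)
         ) AS json_block
    FROM VALUES
        (1, 'test001'),
        (9, 'test001'),
        (2, 'test002')
        v(id, name)
    GROUP BY name
) AS s
ON t.name = s.name                      -- assumed merge key
WHEN MATCHED THEN
    UPDATE SET json_block = s.json_block
WHEN NOT MATCHED THEN
    INSERT (name, json_block) VALUES (s.name, s.json_block);
```

Running the statement a second time should update the existing rows rather than insert duplicates, which is the upsert behaviour asked for. If the target table is meant to keep one row per source row, the merge key and the join back to the JSON value would need to change accordingly.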