Flattening JSON data in Snowflake

Time: 2021-04-30 08:04:22

Tags: snowflake-cloud-data-platform

Below is the JSON data I am trying to flatten in Snowflake.

JSON document:

{
"empDetails": [
    {
        "kind": "person",
        "fullName": "John Doe",
        "age": 22,
        "gender": "Male",
        "phoneNumber": {
            "areaCode": "206",
            "number": "1234567"
        },
        "children": [
            {
                "name": "Jane",
                "gender": "Female",
                "age": "6"
            },
            {
                "name": "John",
                "gender": "Male",
                "age": "15"
            }
        ],
        "citiesLived": [
            {
                "place": "Seattle",
                "yearsLived": [
                    "1995"
                ]
            },
            {
                "place": "Stockholm",
                "yearsLived": [
                    "2005"
                ]
            }
        ]
    },
    {
        "kind": "person",
        "fullName": "Mike Jones",
        "age": 35,
        "gender": "Male",
        "phoneNumber": {
            "areaCode": "622",
            "number": "1567845"
        },
        "children": [
            {
                "name": "Earl",
                "gender": "Male",
                "age": "10"
            },
            {
                "name": "Sam",
                "gender": "Male",
                "age": "6"
            },
            {
                "name": "Kit",
                "gender": "Male",
                "age": "8"
            }
        ],
        "citiesLived": [
            {
                "place": "Los Angeles",
                "yearsLived": [
                    "1989",
                    "1993",
                    "1998",
                    "2002"
                ]
            },
            {
                "place": "Washington DC",
                "yearsLived": [
                    "1990",
                    "1993",
                    "1998",
                    "2008"
                ]
            },
            {
                "place": "Portland",
                "yearsLived": [
                    "1993",
                    "1998",
                    "2003",
                    "2005"
                ]
            },
            {
                "place": "Austin",
                "yearsLived": [
                    "1973",
                    "1998",
                    "2001",
                    "2005"
                ]
            }
        ]
    },
    {
        "kind": "person",
        "fullName": "Anna Karenina",
        "age": 45,
        "gender": "Female",
        "phoneNumber": {
            "areaCode": "425",
            "number": "1984783"
        },
        "citiesLived": [
            {
                "place": "Stockholm",
                "yearsLived": [
                    "1992",
                    "1998",
                    "2000",
                    "2010"
                ]
            },
            {
                "place": "Russia",
                "yearsLived": [
                    "1998",
                    "2001",
                    ""
                ]
            },
            {
                "place": "Austin",
                "yearsLived": [
                    "1995",
                    "1999"
                ]
            }
        ]
    }
]

}

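This data contains 3 employees and their details, such as name, children, and cities lived in. One employee, "Anna Karenina", has no children details, while the other 2 employees do.

Because the children details are missing, I cannot flatten the third employee's data.

Here is what I have tried so far:

Snowflake code to flatten the JSON: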

select empd.value:kind,
empd.value:fullName,
empd.value:age,
empd.value:gender,   
--empd.value:phoneNumber,
empd.value:phoneNumber.areaCode, 
empd.value:phoneNumber.number ,
empd.value:children -- flattening children
//chldrn.value:name,
//chldrn.value:gender,
//chldrn.value:age,
//city.value:place,
//yr.value:yearsLived
from my_json emp , lateral flatten(input=>emp.Json_data:empDetails) empd , 
lateral flatten(input=>empd.value:children) chldrn
//lateral flatten(input=>empd.value:citiesLived) city,
//lateral flatten(input=>city.value:yearsLived) yr

1 answer:

Answer 0: (score: 1)

You need to use the OUTER switch:

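From the Snowflake documentation for FLATTEN:

OUTER => TRUE | FALSE
  • If FALSE, any input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries, are completely omitted from the output.

  • If TRUE, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns).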

select empd.value:kind,
  empd.value:fullName,
  empd.value:age,
  empd.value:gender,   
  empd.value:phoneNumber,
  empd.value:phoneNumber.areaCode, 
  empd.value:phoneNumber.number ,
  empd.value:children, 
  chldrn.value:name,
  chldrn.value:gender,
  chldrn.value:age,
  city.value:place,
  yr.value   -- each yr row is already a single year string, so no :yearsLived path is needed
from my_json emp,
  lateral flatten(input=>emp.Json_data:empDetails) empd , 
  lateral flatten(input=>empd.value:children, OUTER => TRUE) chldrn,   -- <HERE>
  lateral flatten(input=>empd.value:citiesLived) city,
  lateral flatten(input=>city.value:yearsLived) yr
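
To see why the third employee vanishes without OUTER, here is a minimal plain-Python sketch of the two behaviours described in the quoted documentation. It is only an emulation: the `flatten` helper and the trimmed-down `employees` list are hypothetical names made up for illustration, not part of Snowflake or of any Snowflake client library.

```python
# Illustrative emulation of FLATTEN's OUTER switch in plain Python.
# `flatten` is a hypothetical helper, NOT a Snowflake function.

def flatten(values, outer=False):
    """Yield (index, value) pairs; mimic OUTER => TRUE for missing or empty input."""
    if values:
        yield from enumerate(values)
    elif outer:
        # Zero-row expansion: emit exactly one row of NULL-like placeholders.
        yield (None, None)

# Trimmed-down stand-in for the empDetails array in the question.
employees = [
    {"fullName": "John Doe", "children": [{"name": "Jane"}, {"name": "John"}]},
    {"fullName": "Anna Karenina"},  # no "children" key
]

def child_rows(use_outer):
    """Join each employee to their flattened children, like a LATERAL FLATTEN."""
    return [
        (emp["fullName"], child["name"] if child else None)
        for emp in employees
        for _, child in flatten(emp.get("children"), outer=use_outer)
    ]

inner_rows = child_rows(use_outer=False)  # employee without children is dropped
outer_rows = child_rows(use_outer=True)   # employee is kept with a None child
print(inner_rows)
print(outer_rows)
```

With `use_outer=False` the employee without children produces no rows at all, which matches the missing third employee in the question. With `use_outer=True` she appears once with an empty child, which is what `OUTER => TRUE` does in the query above. If `citiesLived` or `yearsLived` could also be missing or empty in real data, the same switch would presumably be needed on those FLATTEN calls as well.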