Parsing JSON key values in Snowflake when the keys contain dots

Time: 2020-02-06 21:00:18

Tags: sql json snowflake-cloud-data-platform

```
{
  "deviceLocale": "en_US",
  "deviceSerialNumber": "xxxxxxxxxx",
  "eventSource": "abc",
  "ext.user.browser": "Mobile Safari",
  "ext.user.browser.version": "1.0.4",
  "ext.user.device.family": "iPhone",
  "ext.user.os": "iOS",
  "ext.user.os.version": "1.3.0",
  "Timestamp": 158007896874 }
```

This is my sample JSON.

I am parsing it in Snowflake with:

```
select distinct
eve_id,
json_payload:ext.useragent.device.family::varchar as type,
json_payload:ext.useragent.os::varchar as osname,
json_payload:ext.useragent.os.version::varchar as os 
from XYZ table, lateral flatten (input => json_payload)
```

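But all three fields come back as NULL, even though I can see the data in the JSON, so I guess the parsing is incorrect. I know that in Snowflake a dot or a colon in a path refers to nested keys. But in my case I have a simple JSON with no nested keys.

Any ideas?

1 answer:

Answer 0: (score: 1)

First, you can wrap the key names in double quotes, for example: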

```
SELECT parse_json('{
        "deviceLocale": "en_US",
        "deviceSerialNumber": "xxxxxxxxxx",
        "eventSource": "abc",
        "ext.user.browser": "Mobile Safari",
        "ext.user.browser.version": "1.0.4",
        "ext.user.device.family": "iPhone",
        "ext.user.os": "iOS",
        "ext.user.os.version": "1.3.0",
        "Timestamp": 158007896874 }') AS json_payload,
    json_payload:"ext.user.device.family"::varchar as type,
    json_payload:"ext.user.os"::varchar as osname,
    json_payload:"ext.user.os.version"::varchar as os;
```

which gives:

```
JSON_PAYLOAD    TYPE    OSNAME    OS
{ "Timestamp": 158007896874, "deviceLocale": "en_US", "deviceSerialNumber": "xxxxxxxxxx", "eventSource": "abc", "ext.user.browser": "Mobile Safari", "ext.user.browser.version": "1.0.4", "ext.user.device.family": "iPhone", "ext.user.os": "iOS", "ext.user.os.version": "1.3.0" }    iPhone    iOS    1.3.0
```

Or you can use the `['']` form, e.g. `json_payload['ext.user.os.version']::varchar as os`, which avoids the double quotes in case those are a problem.
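As a minimal sketch of the bracket form, assuming a trimmed-down copy of the sample payload (the CTE name `jp` is only for illustration):

```sql
-- Sketch: bracket notation takes the whole quoted string as one key name,
-- so the dots inside it are not treated as path separators.
WITH jp AS (
    SELECT parse_json('{
      "ext.user.device.family": "iPhone",
      "ext.user.os": "iOS",
      "ext.user.os.version": "1.3.0" }') AS json_payload
)
SELECT
    json_payload['ext.user.device.family']::varchar AS type,
    json_payload['ext.user.os']::varchar            AS osname,
    json_payload['ext.user.os.version']::varchar    AS os
FROM jp;
```

This should return the same three values as the double-quoted version.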

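In the SQL you posted you access `json_payload:ext.useragent.device.family::varchar`, but in the JSON that part of the key is just `user`, not `useragent`. That mismatch alone will give you NULL.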
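Fixing the name by itself would not be enough, because Snowflake reads an unquoted dotted path as traversal into nested objects. A minimal sketch of the difference, using a one-key payload made up for illustration:

```sql
-- Sketch: the same key accessed with and without quoting.
WITH jp AS (
    SELECT parse_json('{ "ext.user.os": "iOS" }') AS json_payload
)
SELECT
    -- Unquoted: looks for key "ext", then "user" inside it, then "os".
    -- There is no top-level "ext" key, so this is NULL.
    json_payload:ext.user.os::varchar   AS unquoted,
    -- Quoted: looks for the single top-level key named "ext.user.os".
    json_payload:"ext.user.os"::varchar AS quoted
FROM jp;
```

Applied to the original query, that means quoting each full key, for example `json_payload:"ext.user.os"::varchar`.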

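In your example you also use LATERAL FLATTEN, but what you are asking is how to access members of the object itself, not of the flattened rows, so the flatten is not needed. If you did want to flatten, you would get one row per top-level key, at which point you would want to filter on `key`, but I suspect that is not what you are trying to do. If it is, it is a good idea to alias the flatten to help show intent: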

```
WITH jp AS (
    SELECT parse_json('{
      "deviceLocale": "en_US",
      "deviceSerialNumber": "xxxxxxxxxx",
      "eventSource": "abc",
      "ext.user.browser": "Mobile Safari",
      "ext.user.browser.version": "1.0.4",
      "ext.user.device.family": "iPhone",
      "ext.user.os": "iOS",
      "ext.user.os.version": "1.3.0",
      "Timestamp": 158007896874 }') AS json_payload
)
SELECT
    f.key,
    f.path,
    f.value
FROM jp, LATERAL FLATTEN (input => json_payload) f;
```

which gives:

```
KEY                       PATH                          VALUE
Timestamp                 Timestamp                     158007896874
deviceLocale              deviceLocale                  "en_US"
deviceSerialNumber        deviceSerialNumber            "xxxxxxxxxx"
eventSource               eventSource                   "abc"
ext.user.browser          ['ext.user.browser']          "Mobile Safari"
ext.user.browser.version  ['ext.user.browser.version']  "1.0.4"
ext.user.device.family    ['ext.user.device.family']    "iPhone"
ext.user.os               ['ext.user.os']               "iOS"
ext.user.os.version       ['ext.user.os.version']       "1.3.0"
```
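If you do want the flattened form, filtering on the key might look like the sketch below. It assumes a shortened copy of the sample payload, and the alias `f` and the list of keys are chosen only for illustration:

```sql
-- Sketch: flatten the top-level object, then keep only the keys of interest.
WITH jp AS (
    SELECT parse_json('{
      "deviceLocale": "en_US",
      "ext.user.device.family": "iPhone",
      "ext.user.os": "iOS",
      "ext.user.os.version": "1.3.0" }') AS json_payload
)
SELECT
    f.key,
    f.value::varchar AS value   -- cast the VARIANT value to plain text
FROM jp, LATERAL FLATTEN (input => jp.json_payload) f
WHERE f.key IN ('ext.user.device.family', 'ext.user.os', 'ext.user.os.version');
```

Here the key is compared as an ordinary string, so the dots need no special quoting beyond the usual single quotes. The result is one row per matching key rather than one column per key.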