Clean bad data out of a JSON column in Snowflake

Date: 2021-05-24 14:17:59

Tags: json json.net snowflake-cloud-data-platform

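I have a table in Snowflake with many columns. One of them is a JSON column named RECORD (see the attached image). The JSON holds multiple keys, and some of the key names contain bad characters such as //, * or \\. I would like to create a function in Snowflake that cleans this bad data out of the JSON column. My JSON is below: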

{
  "F000 //Start Date": "3/3/2016 21:04",
  "F001 End Date": "3/3/2016 23:20",
  "F002 *Response Type//": "IP Address",
  "F003 IP Address": "166.170.14.93",
  "F004 SurveyName\\": "6 Month",
} 


1 Answer:

Answer 0 (score: 0)

A simple approach could be to implement a JavaScript UDF that returns the cleaned JSON, like this:

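CREATE OR REPLACE FUNCTION clean(record variant)
  RETURNS variant
  LANGUAGE JAVASCRIPT
  AS $$
  // Unquoted argument names are exposed to JavaScript in uppercase, hence RECORD.
  const cleaned = {};
  for (const property in RECORD) {
    // Strip "//", "\" and "*" from each key; values are copied unchanged.
    cleaned[property.replace(/\/\/|\\|\*/g, '')] = RECORD[property];
  }
  return cleaned;
  $$;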

with tbl as (select parse_json($1) record from values ('{
  "F000 //Start Date": "3/3/2016 21:04",
  "F001 End Date": "3/3/2016 23:20",
  "F002 *Response Type//": "IP Address",
  "F003 IP Address": "166.170.14.93",
  "F004 SurveyName\\\\": "6 Month",
}'))

select clean(record) from tbl;

{
  "F000 Start Date": "3/3/2016 21:04",
  "F001 End Date": "3/3/2016 23:20",
  "F002 Response Type": "IP Address",
  "F003 IP Address": "166.170.14.93",
  "F004 SurveyName": "6 Month"
}
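
To check the regular expression before deploying the UDF, the same loop can be run as plain JavaScript outside Snowflake, for example in Node.js. The following is only a minimal sketch: the function name `cleanKeys` and the `sample` object are assumptions made for illustration and are not part of the UDF above.

```javascript
// Hypothetical standalone copy of the UDF body, for local testing in Node.js.
function cleanKeys(record) {
  const cleaned = {};
  for (const property of Object.keys(record)) {
    // Remove every "//", "\" and "*" from the key; the value is copied as-is.
    cleaned[property.replace(/\/\/|\\|\*/g, '')] = record[property];
  }
  return cleaned;
}

// Sample data mirroring the question. In a JavaScript string literal,
// "\\" denotes a single backslash character.
const sample = {
  "F000 //Start Date": "3/3/2016 21:04",
  "F001 End Date": "3/3/2016 23:20",
  "F002 *Response Type//": "IP Address",
  "F003 IP Address": "166.170.14.93",
  "F004 SurveyName\\": "6 Month",
};

console.log(cleanKeys(sample));
```

Note that if two dirty keys reduce to the same cleaned name, the later one overwrites the earlier one, so it may be worth checking for such collisions in real data.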