I have a column named submission_date that contains JSON cells like this:
{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24, 2019","January 25, 2019"],"returned":"February 19, 2019"}
or like this:
{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}
I can easily get the first result from the submission_canceled field:
json_extract(submission_date, "$.submission_canceled[0]")
I figured that to get the last value I would do this:
json_extract(submission_date, "$.submission_canceled[-1]")
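But that just gives me a null value. As you can see, the submission_canceled field sometimes holds several dates in a list, and at other times it holds a single date that is not in a list. I want to get either the single item or the last item of the list from the submission_canceled part.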
Answer 0 (score: 2)
The example below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24, 2019","January 25, 2019"],"returned":"February 19, 2019"}' submission_date UNION ALL
SELECT 2, '{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}'
)
SELECT id, REGEXP_REPLACE(ARRAY_REVERSE(SPLIT(JSON_EXTRACT(submission_date, '$.submission_canceled'), '","'))[OFFSET(0)], r'"|\[|\]', '') last_submission_canceled
FROM `project.dataset.table`
with result
Row id last_submission_canceled
1 1 January 25, 2019
2 2 March 5, 2019
Update: below is a "lighter" version
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24, 2019","January 25, 2019"],"returned":"February 19, 2019"}' submission_date UNION ALL
SELECT 2, '{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}'
)
SELECT id, REGEXP_EXTRACT(JSON_EXTRACT(submission_date, '$.submission_canceled'), r'"([^"]*)"\]?$') last_submission_canceled
FROM `project.dataset.table`
with obviously the same result
Row id last_submission_canceled
1 1 January 25, 2019
2 2 March 5, 2019
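To see why the regular expression in the "lighter" version handles both shapes, it can help to try the same pattern outside BigQuery. The following is only a rough Python sketch, not part of the answer's SQL: the function name last_submission_canceled is hypothetical, and it assumes that JSON_EXTRACT returns the matched value as compact JSON text (a quoted string, or a bracketed list of quoted strings), which is imitated here with json.dumps.

```python
import json
import re

# Same pattern as in the "lighter" query: the last double-quoted value,
# optionally followed by a closing bracket, anchored at the end of the text.
LAST_QUOTED = re.compile(r'"([^"]*)"\]?$')


def last_submission_canceled(submission_date):
    """Return the single date, or the last date of the list, or None."""
    value = json.loads(submission_date).get("submission_canceled")
    if value is None:
        return None
    # Assumption: compact re-serialization stands in for JSON_EXTRACT's output,
    # e.g. '["January 24, 2019","January 25, 2019"]' or '"March 5, 2019"'.
    extracted = json.dumps(value, separators=(",", ":"))
    match = LAST_QUOTED.search(extracted)
    return match.group(1) if match else None


# The two sample cells from the question.
rows = [
    '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019",'
    '"February 27, 2019"],"submission_canceled":["January 24, 2019",'
    '"January 25, 2019"],"returned":"February 19, 2019"}',
    '{"submitted":["February 27, 2019","March 5, 2019"],'
    '"submission_canceled":"March 5, 2019"}',
]

for row_id, cell in enumerate(rows, start=1):
    print(row_id, last_submission_canceled(cell))
```

Because the pattern is anchored with $ and the closing bracket is optional, only the final quoted value can match, whether or not it sits inside a list. Note that the pattern assumes the dates themselves contain no escaped double quotes.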