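For JSON data stored as a string, I would like something similar to JSON_EXTRACT_SCALAR, but with a flexible number of result columns.
Here is some sample data. Different rows can have different column names, and the JSON can be nested: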
WITH `my_table` AS (
SELECT '{"sku_types":"{\"id\":\"5433306\",\"product_code\":\"adfklj_ewkj\"}","additional_info":"Face 30 ml","stock_level":"20+"}' as json_string
union all
SELECT '{"additional_info":"Face 100 ml","offer_info":"30%"}' as json_string
)
SELECT *
from my_table;
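I would like to extract this data into separate columns: sku_types.id, sku_types.product_code, additional_info, stock_level, offer_info.
Can this be done in SQL, or does it need JavaScript?
I do not know the JSON field names in advance, so I cannot use JSON_EXTRACT_SCALAR or JSON_EXTRACT for this.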
Answer 0 (score: 1)
The following example is for BigQuery Standard SQL:
#standardSQL
CREATE TEMPORARY FUNCTION parseJson(y STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  // Flatten the JSON into 'dotted.path:value' strings, one per leaf.
  // Note: a null value makes toString() throw in this version.
  var z = [];
  function processKey(node, parent) {
    Object.keys(node).forEach(function(key) {
      // Build the path locally so sibling keys keep the parent prefix.
      var path = parent === '' ? key : parent + '.' + key;
      var value = node[key].toString();
      if (value !== '[object Object]') {
        z.push(path + ':' + value);
      } else {
        processKey(node[key], path);
      }
    });
  }
  processKey(JSON.parse(y), '');
  return z;
""";
WITH `my_table` AS (
SELECT 1 id, '{"sku_types":{"id":"5433306","product_code":"adfklj_ewkj"},"additional_info":"Face 30 ml","stock_level":"20+"}' AS json_string UNION ALL
SELECT 2, '{"additional_info":"Face 100 ml","offer_info":"30%"}' AS json_string
)
SELECT id,
ARRAY(
SELECT AS STRUCT SPLIT(kv, ':')[OFFSET(0)] key, SPLIT(kv, ':')[SAFE_OFFSET(1)] value
FROM UNNEST(parseJson(json_string)) kv
) params
FROM my_table
with the result:
Row  id  params.key              params.value
1    1   sku_types.id            5433306
         sku_types.product_code  adfklj_ewkj
         additional_info         Face 30 ml
         stock_level             20+
2    2   additional_info         Face 100 ml
         offer_info              30%
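As you can see, instead of parsing every possible attribute into its own column (which is not possible here unless you know them in advance), this approach flattens them into key:value pairs inside the params array.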
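To try the flattening logic outside BigQuery, the traversal can be sketched as a plain JavaScript function. This is only a sketch, not the UDF itself: the name flattenJson is made up for illustration, it uses a typeof check rather than the toString comparison, and it assumes the input contains no null values.

```javascript
// Hypothetical standalone version of the flattening step, for local testing in Node.
// Assumption: the JSON has no null values (Object.keys(null) would throw).
// Arrays, if present, are walked by index (for example 'tags.0').
function flattenJson(jsonText) {
  const pairs = [];
  function walk(node, prefix) {
    Object.keys(node).forEach(function (key) {
      const path = prefix === '' ? key : prefix + '.' + key;
      const child = node[key];
      if (typeof child === 'object') {
        walk(child, path); // nested object: recurse with the dotted path
      } else {
        pairs.push(path + ':' + String(child)); // leaf: emit 'path:value'
      }
    });
  }
  walk(JSON.parse(jsonText), '');
  return pairs;
}

const sample = '{"sku_types":{"id":"5433306","product_code":"adfklj_ewkj"},' +
  '"additional_info":"Face 30 ml","stock_level":"20+"}';
console.log(flattenJson(sample));
```

Each element of the returned array corresponds to one row of the params array in the result above.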
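Note: in the example above I use : to build the key:value pairs and then split them again. If you expect values to contain this character, you can adjust the code to use something more unique instead of :, for example :::::::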
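Another way to sidestep the separator question is to keep key and value apart from the start rather than joining and re-splitting them. The sketch below is an illustration under assumptions, not part of the original answer: flattenToPairs is a hypothetical name, and it returns {key, value} objects. As far as I know, a BigQuery JavaScript UDF declared with RETURNS ARRAY<STRUCT<key STRING, value STRING>> can return such objects directly, but check that against the UDF documentation before relying on it.

```javascript
// Hypothetical helper: returns {key, value} objects instead of 'key:value' strings,
// so values containing ':' need no special separator.
// A null leaf is emitted as the string 'null' via String().
function flattenToPairs(jsonText) {
  const pairs = [];
  function walk(node, prefix) {
    Object.keys(node).forEach(function (key) {
      const path = prefix === '' ? key : prefix + '.' + key;
      const child = node[key];
      if (child !== null && typeof child === 'object') {
        walk(child, path); // nested object: recurse with the dotted path
      } else {
        pairs.push({ key: path, value: String(child) });
      }
    });
  }
  walk(JSON.parse(jsonText), '');
  return pairs;
}

console.log(flattenToPairs('{"link":"https://example.com/a:b","size":{"unit":"ml"}}'));
```

With this shape the SPLIT calls in the query would no longer be needed, since key and value arrive as separate fields.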
Quick update to address a comment:
"... the problem is that some JSON values are null, in which case it raises an error"
#standardSQL
CREATE TEMPORARY FUNCTION parseJson(y STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  // Flatten the JSON into 'dotted.path:value' strings; null becomes 'n/a'.
  var z = [];
  function processKey(node, parent) {
    Object.keys(node).forEach(function(key) {
      // Build the path locally so sibling keys keep the parent prefix.
      var path = parent === '' ? key : parent + '.' + key;
      // Test for null explicitly; a falsy check would also catch 0, false and ''.
      var value = node[key] === null ? 'n/a' : node[key].toString();
      if (value !== '[object Object]') {
        z.push(path + ':' + value);
      } else {
        processKey(node[key], path);
      }
    });
  }
  processKey(JSON.parse(y), '');
  return z;
""";
WITH `my_table` AS (
SELECT 1 id, '{"sku_types":{"id":"5433306","product_code":"adfklj_ewkj"},"additional_info":"Face 30 ml","stock_level":"20+"}' AS json_string UNION ALL
SELECT 2, '{"additional_info":"Face 100 ml","offer_info":"30%"}' AS json_string union all
SELECT 3 as id , '{"offer_info":"30%", "price":null}' AS json_string
)
SELECT id,
ARRAY(
SELECT AS STRUCT SPLIT(kv, ':')[OFFSET(0)] key, SPLIT(kv, ':')[SAFE_OFFSET(1)] value
FROM UNNEST(parseJson(json_string)) kv
) params
FROM my_table
with the result:
Row  id  params.key              params.value
1    1   sku_types.id            5433306
         sku_types.product_code  adfklj_ewkj
         additional_info         Face 30 ml
         stock_level             20+
2    2   additional_info         Face 100 ml
         offer_info              30%
3    3   offer_info              30%
         price                   n/a
As you can see here, I replaced null values with 'n/a', but you can apply whatever logic you need.
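For instance, the replacement logic can be isolated in a small helper so that it is easy to change. In this sketch, leafToText and its nullText parameter are hypothetical names. It tests for null explicitly, because a falsy check such as !value also matches 0, false, and the empty string.

```javascript
// Hypothetical helper: converts a JSON leaf value to text.
// Only null is replaced by the placeholder; 0, false and '' keep their own text.
function leafToText(value, nullText) {
  if (value === null) {
    return nullText;
  }
  return String(value);
}

console.log(leafToText(null, 'n/a'));
console.log(leafToText(0, 'n/a'));
console.log(leafToText(false, 'n/a'));
```

Passing a different nullText, or returning early to skip null entries altogether, are both small changes to this one function.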