I have a dataset with a column that holds an array of objects, like this:
ID TAGS
1 {"tags": [{"tag": "a"}, {"tag": "b"}]}
2 {"tags": [{"tag": "c"}, {"tag": "d"}]}
I want to extract the tag field from each element of the array, so the final result would be:
ID TAGS
1 ["a","b"]
2 ["c","d"]
Assume the following table t1:
CREATE OR REPLACE TEMPORARY TABLE t1 AS (
select 1 as ID , PARSE_JSON('{"tags": [{"tag":"a"}, {"tag":"b"}]}') AS PAYLOAD
UNION ALL
select 2, PARSE_JSON('{"tags": [{"tag":"c"}, {"tag":"d"}]}')
);
Answer 0 (score: 1)
One possible solution is to create a JavaScript function and use JavaScript's .map() to apply a function to each element of the array:
create or replace function extract_tags(a array)
returns array
language javascript
strict
as '
// Snowflake exposes the argument in uppercase inside the JavaScript body, hence A.
return A.map(function(d) {return d.tag});
';
SELECT ID, EXTRACT_TAGS(PAYLOAD:tags) AS tags from t1;
This gives the expected result:
ID TAGS
1 [ "a", "b" ]
2 [ "c", "d" ]
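To check the mapping logic without a Snowflake session, the body of the UDF can be tried in plain JavaScript. This is only a sketch: extractTags is a hypothetical name, it runs in Node.js rather than inside Snowflake, and the empty-array fallback for missing input is an assumption added here (the strict UDF returns NULL for NULL input instead).

```javascript
// Plain-JavaScript sketch of the UDF body; runs in Node.js, outside Snowflake.
// extractTags is a hypothetical name used only for this illustration.
function extractTags(a) {
  // Assumption: treat a missing or non-array input as "no tags".
  if (!Array.isArray(a)) {
    return [];
  }
  // Pull the `tag` field out of every element, as the UDF does with .map().
  return a.map(function (d) {
    return d.tag;
  });
}

const payload = { tags: [{ tag: "a" }, { tag: "b" }] };
console.log(JSON.stringify(extractTags(payload.tags))); // ["a","b"]
```

Elements without a tag field map to undefined here, so real data may need a filter step before or after the map.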
Answer 1 (score: 1)
A pure SQL approach is to combine LATERAL FLATTEN and ARRAY_AGG like this:
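with t2 as (
-- one row per array element; f is the alias of the flattened rows
select t1.ID, f.value:tag as tag
from t1, LATERAL FLATTEN(input => t1.payload:tags) f
)
-- collect the tags back into one array per ID
select t2.id, ARRAY_AGG(t2.tag) as tags from t2
group by ID
order by ID ASC;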
On its own, t2 would be:
ID TAG
1 "a"
1 "b"
2 "c"
2 "d"
After GROUP BY ID, this becomes:
ID TAGS
1 [ "a", "b" ]
2 [ "c", "d" ]
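The two stages of the SQL approach (flatten, then aggregate per ID) can be mimicked in plain JavaScript to make the intermediate rows visible. This is a rough analogue under stated assumptions, not how Snowflake executes the query: flattenTags and aggregateTags are hypothetical helper names, and the in-memory t1 array stands in for the table.

```javascript
// In-memory stand-in for table t1 (ID plus an already parsed PAYLOAD).
const t1 = [
  { ID: 1, PAYLOAD: { tags: [{ tag: "a" }, { tag: "b" }] } },
  { ID: 2, PAYLOAD: { tags: [{ tag: "c" }, { tag: "d" }] } },
];

// Step 1 (FLATTEN analogue): emit one row per array element.
function flattenTags(rows) {
  return rows.flatMap((row) =>
    row.PAYLOAD.tags.map((element) => ({ ID: row.ID, TAG: element.tag }))
  );
}

// Step 2 (GROUP BY + ARRAY_AGG analogue): collect the tags of each ID.
function aggregateTags(flatRows) {
  const byId = new Map();
  for (const { ID, TAG } of flatRows) {
    if (!byId.has(ID)) {
      byId.set(ID, []);
    }
    byId.get(ID).push(TAG);
  }
  // Sort by ID to mirror ORDER BY ID ASC.
  return [...byId.entries()]
    .sort(([a], [b]) => a - b)
    .map(([ID, TAGS]) => ({ ID, TAGS }));
}

const t2 = flattenTags(t1); // four rows, one per tag
console.log(aggregateTags(t2));
```

One difference worth keeping in mind: this sketch preserves element order by construction, whereas ARRAY_AGG only guarantees an order when a WITHIN GROUP (ORDER BY ...) clause is supplied.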