How do I apply a function to each element of an array column?

Time: 2019-09-30 18:49:34

Tags: snowflake-data-warehouse

I have a dataset with a column that holds an array of objects, like this:

ID   TAGS
1    {"tags": [{"tag": "a"}, {"tag": "b"}]}
2    {"tags": [{"tag": "c"}, {"tag": "d"}]}

I want to extract the tag field from each element of the array, so the final result would be:

ID   TAGS
1    ["a","b"]
2    ["c","d"]

Assume the following table t1:

CREATE OR REPLACE TEMPORARY TABLE t1 AS (
    select 1 as ID, PARSE_JSON('{"tags": [{"tag":"a"}, {"tag":"b"}]}') AS PAYLOAD
    UNION ALL
    select 2, PARSE_JSON('{"tags": [{"tag":"c"}, {"tag":"d"}]}')
);

2 answers:

Answer 0: (score: 1)

One possible solution is to create a JavaScript function and use JavaScript's .map() to apply a function to each element of the array:

create or replace function extract_tags(a array)
  returns array
  language javascript
  strict
  as '
  // Snowflake exposes the SQL argument a as uppercase A in JavaScript.
  return A.map(function(d) {return d.tag});
  ';

SELECT ID, EXTRACT_TAGS(PAYLOAD:tags) AS tags from t1;

This gives the expected result:

ID  TAGS
1   [    "a",    "b"  ]
2   [    "c",    "d"  ]
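To try the mapping logic outside Snowflake, the following is a minimal plain-JavaScript sketch of what the UDF body does. The function name extractTags and the in-memory rows array are assumptions made for illustration; they stand in for the UDF and for table t1 and are not part of the answer above.

```javascript
// Hypothetical stand-in for the UDF body: pull the "tag" field
// out of every element of the array.
function extractTags(a) {
  return a.map(function (d) { return d.tag; });
}

// In-memory copy of the sample rows from t1 (illustration only).
const rows = [
  { id: 1, payload: { tags: [{ tag: "a" }, { tag: "b" }] } },
  { id: 2, payload: { tags: [{ tag: "c" }, { tag: "d" }] } },
];

// Mirrors: SELECT ID, EXTRACT_TAGS(PAYLOAD:tags) AS tags FROM t1
const result = rows.map(function (row) {
  return { id: row.id, tags: extractTags(row.payload.tags) };
});

console.log(JSON.stringify(result));
// prints [{"id":1,"tags":["a","b"]},{"id":2,"tags":["c","d"]}]
```

Because .map() visits elements in array order, the extracted tags keep the order they had in the source array.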

Answer 1: (score: 1)

A pure SQL approach is to combine LATERAL FLATTEN and ARRAY_AGG, like this:

with t2 as (
    -- f is the flattened output: one row per element of payload:tags
    select ID, f.value:tag as tag
    from t1, LATERAL FLATTEN(input => payload:tags) f
)
select t2.id, ARRAY_AGG(t2.tag) as tags from t2
group by ID
order by ID ASC;

On its own, t2 evaluates to:

ID  TAG
1   "a"
1   "b"
2   "c"
2   "d"

and after GROUP BY ID it becomes:

ID  TAGS
1   [    "a",    "b"  ]
2   [    "c",    "d"  ]
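The two steps of this answer (flatten to one row per element, then aggregate per ID) can also be pictured with a small plain-JavaScript sketch. This is an analogy rather than Snowflake code: the t1 array, and the names flattened and tagsById, are assumptions introduced only for illustration.

```javascript
// In-memory copy of the sample rows from t1 (illustration only).
const t1 = [
  { id: 1, payload: { tags: [{ tag: "a" }, { tag: "b" }] } },
  { id: 2, payload: { tags: [{ tag: "c" }, { tag: "d" }] } },
];

// Step 1, analogous to LATERAL FLATTEN: one output row per array element.
const flattened = t1.flatMap(function (row) {
  return row.payload.tags.map(function (el) {
    return { id: row.id, tag: el.tag };
  });
});

// Step 2, analogous to GROUP BY ID with ARRAY_AGG: collect tags per id.
const groups = new Map();
for (const r of flattened) {
  if (!groups.has(r.id)) {
    groups.set(r.id, []);
  }
  groups.get(r.id).push(r.tag);
}
const tagsById = Array.from(groups, function (entry) {
  return { id: entry[0], tags: entry[1] };
});

console.log(JSON.stringify(flattened));
console.log(JSON.stringify(tagsById));
```

The sketch keeps elements in source order because it iterates the arrays sequentially. In Snowflake, ARRAY_AGG without a WITHIN GROUP (ORDER BY ...) clause does not promise a particular element order, so if order matters it is worth checking the ARRAY_AGG documentation before relying on the output above.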