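I have several string columns and one array column. I need to convert the array to a string and concatenate it with the other string columns, so that I can run the MD5 function on the concatenated string.
However, casting the array directly to a string is not possible. I also tried using the explode and inline functions to extract the array contents, but no luck so far.
Any ideas on how to achieve this?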
Answer 0 (score: 2)
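Explode the array to get the struct elements, build the string you need from each struct, collect those strings back into an array, turn that array into a single string with concat_ws, and then concatenate it with the other columns. Like this: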
with mydata as (
  select ID, my_array
  from
  ( --some array<struct> example
    select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA"),named_struct("city","Albany","state","NY")) as my_array
    union all
    select 2 ID, array(named_struct("city","San Jose","state","CA"),named_struct("city","San Diego","state","CA")) as my_array
  )s
)
select ID, concat(ID,'-', --'-' is a delimiter
                  concat_ws(',',collect_list(element)) --collect array of strings and concatenate it using ',' delimiter
                 ) as my_string --concatenate with ID column also
from
(
  select s.ID, concat_ws(':',a.mystruct.city, a.mystruct.state) as element --concatenate struct fields using ':' as a delimiter, or concatenate in some other way
  from mydata s
       lateral view explode(s.my_array) a as mystruct
)s
group by ID
;
Returns:
OK
1 1-Hudson:NY,San Jose:CA,Albany:NY
2 2-San Jose:CA,San Diego:CA
Time taken: 63.368 seconds, Fetched: 2 row(s)
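The query above stops at the concatenated string; the MD5 step the question asks for can be layered on top. The following is a minimal sketch, not a tested solution: col_a and col_b are hypothetical names standing in for the other string columns, the '|' delimiter is an arbitrary choice, and md5() is assumed to be available (it was added in Hive 1.3.0).

```sql
-- Sketch only: col_a and col_b are hypothetical string columns,
-- and '|' is an arbitrary delimiter chosen for illustration.
with mydata as (
  select 1 ID, 'foo' col_a, 'bar' col_b,
         array(named_struct("city","Hudson","state","NY"),
               named_struct("city","San Jose","state","CA")) as my_array
)
select ID,
       -- hash the other string columns together with the flattened array
       md5(concat_ws('|', col_a, col_b, array_string)) as row_md5
from
(
  select s.ID, s.col_a, s.col_b,
         -- one 'city:state' string per struct, joined with ','
         concat_ws(',', collect_list(concat_ws(':', a.mystruct.city, a.mystruct.state))) as array_string
  from mydata s
       lateral view explode(s.my_array) a as mystruct
  group by s.ID, s.col_a, s.col_b
) t
;
```

Note that collect_list does not guarantee the order of the collected elements. If the hash has to be stable across runs, one option is to wrap the collected array in sort_array(...) before concat_ws, at the cost of no longer preserving the original array order.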
Alternatively, you can use INLINE to explode the struct elements directly into columns:
with mydata as (
  select ID, my_array
  from
  ( --some array<struct> example
    select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA"),named_struct("city","Albany","state","NY")) as my_array
    union all
    select 2 ID, array(named_struct("city","San Jose","state","CA"),named_struct("city","San Diego","state","CA")) as my_array
  )s
)
select s.ID, a.city, a.state
from mydata s
     lateral view inline(s.my_array) a as city, state
;
Then concatenate them into strings again as needed: collect the array, apply concat_ws, and so on.
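Put together, the INLINE variant might look like the sketch below. It reuses the sample data from the answer; the explicit cast of ID to string is an added precaution rather than something the original query needed, and the ':' and ',' delimiters are simply carried over from the first query.

```sql
with mydata as (
  select ID, my_array
  from
  ( --same array<struct> sample data as above
    select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA"),named_struct("city","Albany","state","NY")) as my_array
    union all
    select 2 ID, array(named_struct("city","San Jose","state","CA"),named_struct("city","San Diego","state","CA")) as my_array
  )s
)
select s.ID,
       concat(cast(s.ID as string), '-',
              -- rebuild 'city:state' per row, then collect and join with ','
              concat_ws(',', collect_list(concat_ws(':', a.city, a.state)))
             ) as my_string
from mydata s
     lateral view inline(s.my_array) a as city, state --inline yields one column per struct field
group by s.ID
;
```

This is intended to build the same kind of string as the explode version without the intermediate subquery. As with that version, the element order inside my_string depends on collect_list, which does not guarantee ordering.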