Hive: concatenating string columns with an array<struct> column

Time: 2019-08-02 05:20:23

Tags: arrays hive concatenation hiveql

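I have several string columns and one array column. I need to convert the array to a string and concatenate it with the other string columns, so that I can run the MD5 function on the concatenated string.

Casting the array to a string is not possible, though. I also tried the explode and inline functions to extract the array contents, but no luck so far.

Any ideas on how to achieve this?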

1 Answer:

Answer 0: (score: 2)

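Explode the array to get the struct elements, build the string you need from each struct, collect the resulting strings into an array, convert that array to a string with concat_ws, and then concatenate it with the other columns. Like this: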

with mydata as (
select ID, my_array  
from
( --some array<struct> example
 select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA"),named_struct("city","Albany","state","NY")) as my_array
 union all
 select 2 ID, array(named_struct("city","San Jose","state","CA"),named_struct("city","San Diego","state","CA")) as my_array
)s
)


select ID, concat(ID,'-', --'-' is a delimiter
                 concat_ws(',',collect_list(element)) --collect array of strings and concatenate it using ',' delimiter
                 ) as my_string --concatenate with ID column also
from
(
select s.ID, concat_ws(':',a.mystruct.city, a.mystruct.state) as element --concatenate struct fields using ':' as a delimiter, or concatenate in some other way
  from mydata s 
       lateral view explode(s.my_array) a as mystruct
)s 
group by ID 
; 

Returns:

OK
1       1-Hudson:NY,San Jose:CA,Albany:NY
2       2-San Jose:CA,San Diego:CA
Time taken: 63.368 seconds, Fetched: 2 row(s)
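
To get from this string to the hash the question asks for, the result can be wrapped in md5(). The following is only a sketch under a few assumptions: other_col is a hypothetical string column standing in for the "other string columns", the sample row is made up, and md5() is available as a built-in (it was added in Hive 1.3.0, as far as I know). Since collect_list does not guarantee element order, the sketch sorts the collected strings with sort_array so that the same data always produces the same hash.

```sql
-- Sketch only: other_col and the sample row are hypothetical
with mydata as (
 select 1 ID, 'abc' as other_col,
        array(named_struct("city","Hudson","state","NY"),named_struct("city","Albany","state","NY")) as my_array
)

select ID,
       md5(concat_ws('-',                         -- '-' separates the columns
                     cast(ID as string),
                     other_col,
                     concat_ws(',', sort_array(collect_list(element))) -- sorted for a stable hash
           )) as row_hash
from
(
select s.ID, s.other_col, concat_ws(':', a.mystruct.city, a.mystruct.state) as element
  from mydata s
       lateral view explode(s.my_array) a as mystruct
)t
group by ID, other_col
;
```

Sorting changes the order of the elements relative to the original array. If the original order has to be part of the hash, a different approach is needed (for example keeping the position from posexplode and sorting by it).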

Alternatively, INLINE explodes the array directly into the struct's fields:

with mydata as (
select ID, my_array  
from
( --some array<struct> example
 select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA"),named_struct("city","Albany","state","NY")) as my_array
 union all
 select 2 ID, array(named_struct("city","San Jose","state","CA"),named_struct("city","San Diego","state","CA")) as my_array
)s
)

select s.ID, a.city, a.state
  from mydata s 
       lateral view inline(s.my_array) a as city, state

;

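Then concatenate the fields back into a string as needed: build the element strings, collect them into an array, apply concat_ws, and so on.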
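
For completeness, here is a minimal sketch of that last step on top of the INLINE version. It uses a made-up single-row sample, and the output column name my_string is just carried over from the first query. The order of elements inside the collected array is not guaranteed.

```sql
-- Sketch only: sample data is made up for illustration
with mydata as (
 select 1 ID, array(named_struct("city","Hudson","state","NY"),named_struct("city","San Jose","state","CA")) as my_array
)

select s.ID,
       concat(cast(s.ID as string), '-',                       -- prefix with ID
              concat_ws(',', collect_list(                     -- array of strings -> one string
                  concat_ws(':', a.city, a.state)))            -- one "city:state" string per struct
       ) as my_string
  from mydata s
       lateral view inline(s.my_array) a as city, state
 group by s.ID
;
```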