How do I combine maps across multiple rows in Hive?

Time: 2017-10-24 22:35:00

Tags: hive aggregate-functions

I have a Hive table like this:

string id,  map<string attr_name,double attr_value> attrs

But the data is not laid out the way you would expect. Each attribute has its own row, for example:

1   height:180.0
1   weight:76.0
2   height:170.0
2   weight:74

How can I write a Hive query that collapses these rows into the following?

1   height:180,weight:76.0
2   height:170,weight:74.0

2 Answers:

Answer 0: (score: 0)

Aggregate the attributes with min() or max() and group by id, assemble a delimited string, and convert it to a map with the str_to_map() function:

select id,
       str_to_map(
         concat_ws(',',
           concat('height', ':', cast(max(attrs['height']) as string)),
           concat('weight', ':', cast(max(attrs['weight']) as string))),
         ',', ':') as attrs
from
(--this is your table data, just replace the subquery (s) with your table
 select 1 as id, map('height',180.0) as attrs union all
 select 1 as id, map('weight',76.0) as attrs union all
 select 2 as id, map('height',170.0) as attrs union all
 select 2 as id, map('weight',74.0) as attrs
 )s
 group by id;

OK
1       {"height":"180.0","weight":"76.0"}
2       {"height":"170.0","weight":"74.0"}
Time taken: 74.219 seconds, Fetched: 2 row(s)

Add more attributes as needed and replace the subquery with your table.
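Note that str_to_map() returns a map<string,string>, which is why the values in the output above are quoted. If the values should stay double, as in the original table definition, one possible variant is to build the map directly with the map() constructor instead of going through a string. This is only a sketch under the same assumptions as the query above: the attribute names height and weight are known in advance, and the inline subquery stands in for the real table.

```sql
-- Sketch: same aggregation as above, but the values stay numeric
-- because the map is built with map() rather than str_to_map().
select id,
       map('height', max(attrs['height']),
           'weight', max(attrs['weight'])) as attrs
from
(-- sample data only; replace this subquery with your table
 select 1 as id, map('height',180.0) as attrs union all
 select 1 as id, map('weight',76.0) as attrs union all
 select 2 as id, map('height',170.0) as attrs union all
 select 2 as id, map('weight',74.0) as attrs
 )s
 group by id;
```

As with the original query, each additional attribute needs its own key and max() pair.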

Answer 1: (score: 0)

Use the aggregate function collect_set. Source

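-- your_table is a placeholder for the actual table name;
-- attrs is the map column from the question
select id, collect_set(attrs)
from your_table
group by id;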
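Keep in mind that collect_set on its own gathers the per-row maps into an array per id rather than merging them into a single map. If one merged map is needed without listing every attribute name, a possible approach is to explode each map into key/value rows and reassemble them per id. The sketch below assumes a table named your_table (a placeholder, not from the question), that keys and values contain neither ',' nor ':', and that each id has at most one row per attribute. The resulting map has string values.

```sql
-- Sketch: flatten every map into (k, v) rows, then rebuild one map per id.
select id,
       str_to_map(
         concat_ws(',', collect_list(concat(k, ':', cast(v as string)))),
         ',', ':') as attrs
from your_table            -- hypothetical table name
lateral view explode(attrs) e as k, v
group by id;
```

This trades the hard-coded attribute list for a dependency on the chosen delimiters, so it only fits data where those characters cannot appear in keys or values.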