Can I convert table data into the complex type list<struct> using a Hive query?

Asked: 2015-12-31 07:02:31

Tags: java hadoop hive

Here are the table details:

Id    Data  
a     {"col1":"11.0","col2":30.0}  
a     {"col1":"12.0","col2":10.0}  
b     {"col1":"11.0","col2":20.0}  
b     {"col1":"12.0","col2":25.0}  
b     {"col1":"15.0","col2":25.0}  
c     {"col1":"12.0","col2":15.0}  
c     {"col1":"13.0","col2":16.0}  

Expected output: a list of the Data structs, grouped by ID.

ID  Data  
a   list[ {"col1":"11.0","col2":30.0},{"col1":"12.0","col2":10.0}]  
b   list[ {"col1":"11.0","col2":20.0},{"col1":"12.0","col2":25.0},{"col1":"15.0","col2":25.0}]  
c   list[ {"col1":"12.0","col2":15.0},{"col1":"13.0","col2":16.0}] 

Is this possible with a function that Hive already supports, or do I need to write a user-defined function?

1 Answer:

Answer 0 (score: 0)

The short answer is yes. This has been answered before; see here:

How to get array/bag of elements from Hive group by operator?

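To summarize: if you only want the distinct elements, use collect_set; otherwise use collect_list (available only in Hive 0.13+). Apart from that, it is a standard GROUP BY query.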
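
As a minimal sketch of that GROUP BY query: the table name `my_table` and the column type `STRUCT<col1:STRING, col2:DOUBLE>` are assumptions inferred from the sample rows in the question, not something the asker stated. Also note that some older Hive releases accept only primitive arguments in collect_list and collect_set, so passing a struct column may require a newer version; check this against your installation.

```sql
-- Assumed (hypothetical) schema:
--   my_table(id STRING, data STRUCT<col1:STRING, col2:DOUBLE>)

-- Keep every struct per id, duplicates included.
SELECT id,
       collect_list(data) AS data_list   -- array<struct<col1:string,col2:double>>
FROM my_table
GROUP BY id;

-- Variant: keep only the distinct structs per id.
SELECT id,
       collect_set(data) AS data_set
FROM my_table
GROUP BY id;
```

Neither function guarantees the order of elements inside the resulting array, so do not rely on the rows coming back in the order shown in the expected output.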