I have a table in the following format:
ID Property Value
1 name Tim
1 location USA
1 age 30
2 name Jack
2 location UK
2 age 27
I want output in the following format:
ID name location age
1 Tim USA 30
2 Jack UK 27
In Python I can do:
table_agg = table.groupby('ID')[['Property','Value']].apply(lambda x: dict(x.values))
p = pd.DataFrame(list(table_agg))
How can I write this query in Hive?
Answer 0 (score: 2)
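You can use the collect_list and map functions to group the data by id, then access each element of the resulting array by index and key. Note that the index positions used in the query (va[0], va[1], va[2]) assume the rows for each id are collected in the order name, location, age.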
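To make the grouping step concrete, here is a rough Python model of what collect_list(map(property, value)) builds per id. It is only a sketch of the semantics, not Hive code; the helper names collect_maps and pivot are made up for illustration. It also merges the per-row maps into one map per id, so the lookup no longer depends on the order in which rows were collected.

```python
from collections import defaultdict

# Rows of the (id, property, value) table from the question.
rows = [
    (1, "name", "Tim"), (1, "location", "USA"), (1, "age", "30"),
    (2, "name", "Jack"), (2, "location", "UK"), (2, "age", "27"),
]

def collect_maps(rows):
    """Hypothetical model of collect_list(map(property, value)) grouped by id."""
    groups = defaultdict(list)
    for id_, prop, value in rows:
        groups[id_].append({prop: value})  # one single-entry map per row
    return dict(groups)

def pivot(rows, keys=("name", "location", "age")):
    """Merge each group's maps so lookups do not depend on list order."""
    result = []
    for id_, maps in sorted(collect_maps(rows).items()):
        merged = {k: v for m in maps for k, v in m.items()}
        result.append((id_, *(merged.get(k) for k in keys)))
    return result

print(pivot(rows))  # [(1, 'Tim', 'USA', '30'), (2, 'Jack', 'UK', '27')]
```

Reversing the input rows gives the same result, which is the point of merging before looking up by key.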
Example:
hive> create table t1(id int,property string,value string) stored as orc;
hive> insert into t1 values(1,"name","Tim"),(1,"location","USA"),(1,"age","30"),(2,"name","Jack"),(2,"location","UK"),(2,"age","27");
hive> select id,
va[0]["name"] name,
va[1]["location"] location,
va[2]["age"] age
from (
select id,collect_list(map(property,value)) va
from t1 group by id
)t;
Result:
id name location age
1 Tim USA 30
2 Jack UK 27
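If the rows for an id might not arrive in the order name, location, age, a different technique is conditional aggregation: max(case when ...) per property, grouped by id. This is not the collect_list approach above but a swapped-in alternative. The sketch below runs it on SQLite through Python's sqlite3 module as a stand-in, since that is easy to try locally; the SQL is plain max/case/group by and should carry over to Hive, but it has not been run on Hive here.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive table t1 (assumption:
# same column names as in the example above).
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (id int, property text, value text)")
conn.executemany(
    "insert into t1 values (?, ?, ?)",
    [
        (1, "name", "Tim"), (1, "location", "USA"), (1, "age", "30"),
        (2, "name", "Jack"), (2, "location", "UK"), (2, "age", "27"),
    ],
)

# Each max(...) sees NULL for non-matching rows and ignores them, so it
# picks out the single value for that property regardless of row order.
PIVOT_SQL = """
select id,
       max(case when property = 'name' then value end) as name,
       max(case when property = 'location' then value end) as location,
       max(case when property = 'age' then value end) as age
from t1
group by id
order by id
"""

result = conn.execute(PIVOT_SQL).fetchall()
print(result)  # [(1, 'Tim', 'USA', '30'), (2, 'Jack', 'UK', '27')]
```

The trade-off is that every property has to be listed explicitly in the select, the same as with the va[...] lookups.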