I have a Hive table as follows:
select id, id_2, val from test order by id;
234 974 0.5
234 457 0.7
234 236 0.5
234 859 0.6
123 859 0.7
123 236 0.6
123 974 0.5
123 457 0.5
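I am trying to collect the data grouped by the id value. The collected data needs to follow the same order in every row. My expected output is as follows (any order is fine, as long as it is the same for all rows):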
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [974,457,236,859] [0.5,0.5,0.6,0.7]
I used the collect UDF from Brickhouse:
select tmp.id, collect(id_2), collect(tmp.val) from
(select id, id_2, val from test
order by id) tmp
group by tmp.id
;
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [859,236,974,457] [0.7,0.6,0.5,0.5]
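As you can see, the order of the collected elements is not preserved across rows. Is there a way to keep the ordering consistent throughout the output? Any hints would be appreciated.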
Answer 0 (score: 2)
Use this query:
select tmp.id, collect(id_2), collect(tmp.val) from
(select id, id_2, val from test
order by id desc, id_2 desc) tmp
group by tmp.id
;
The output is as follows:
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [974,457,236,859] [0.5,0.5,0.6,0.7]
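The essential change is replacing
order by id
with
order by id desc, id_2 desc
so that the rows within each id reach collect in the same id_2 order.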
Answer 1 (score: -2)
Use sort_array(collect_list(cols))
;
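Note that applying sort_array to each collected column separately would sort the id_2 values and the val values independently, so positions in the two arrays would no longer correspond. A possible way around this, sketched below, is to collect each (id_2, val) pair as a struct, sort the array of structs, and then pull the fields back out. This is only a sketch under assumptions: it assumes a Hive version in which collect_list and sort_array accept struct elements and in which field access on an array of structs returns an array of that field. The aliases pairs, id_2_list, val_list and t are names made up for illustration.

```sql
-- Sketch only: assumes collect_list and sort_array accept struct elements
-- in the Hive version in use (not verified against any specific release).
SELECT
  t.id,
  t.pairs.id_2 AS id_2_list,  -- array of id_2, in the sorted pair order
  t.pairs.val  AS val_list    -- array of val, aligned with id_2_list
FROM (
  SELECT
    id,
    -- Structs are compared field by field, so id_2 (listed first)
    -- determines the order and val travels with it.
    sort_array(
      collect_list(named_struct('id_2', id_2, 'val', val))
    ) AS pairs
  FROM test
  GROUP BY id
) t;
```

If this works as assumed, every id gets its id_2 values in ascending order with the matching val at the same position, without relying on the row order produced by a subquery.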