How to sum elements with the same index across arrays in different rows in Hive

Time: 2018-05-14 02:27:18

Tags: hadoop hive

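I'll explain what I need to do in Hive with an example. I receive two rows: the first row has an array like (1,3,6,7) and the second row has (3,6,7,1). As a result I need (4,9,13,8).

So I need to add together the elements at the first index of the arrays from all rows, do the same for the second index, and so on.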

1 answer:

Answer 0: (score: 1)

Base table:

hive> select values from t1;
1,3,6,7
3,6,7,1

Explode with position (posexplode emits each element together with its zero-based index):

hive> select pos,value from t1 lateral view posexplode (split(values,",")) a as pos, value;
0   3
1   6
2   7
3   1
0   1
1   3
2   6
3   7

Sum by position:

hive> select pos,sum(cast(value as int)) from t1 lateral view posexplode (split(values,",")) a as pos, value group by pos;
0   4
1   9
2   13
3   8

Collect the summed values into a list:

hive> select collect_list(sumvalue) from (select sum(cast(value as int)) as sumvalue from t1 lateral view posexplode (split(values,",")) a as pos, value group by pos)s;
[4,9,13,8]
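
One caveat, offered as a hedged note and not part of the original answer: as far as I know, Hive does not guarantee the order in which collect_list gathers its rows, so the final list may not always come back in position order, for example when more than one reducer is involved. If every array is known to have exactly four elements (an assumption taken from the example, not something the question states), a minimal sketch that sidesteps the ordering issue is to sum each index explicitly and build the result array directly. The backticks around values are an extra precaution, since VALUES is a reserved keyword in newer Hive releases.

```sql
-- Sketch only: assumes t1.values is a comma-separated string
-- with exactly 4 elements in every row.
-- `values` is backtick-quoted because VALUES is reserved in newer Hive releases.
select array(
         sum(cast(a[0] as int)),  -- total of the elements at index 0
         sum(cast(a[1] as int)),  -- total of the elements at index 1
         sum(cast(a[2] as int)),  -- total of the elements at index 2
         sum(cast(a[3] as int))   -- total of the elements at index 3
       ) as summed
from (
  -- split the string once per row so each index can be addressed directly
  select split(`values`, ',') as a
  from t1
) s;
```

For the two example rows this should give the same [4,9,13,8], with the element order fixed by the query itself. For arrays of unknown or varying length, the posexplode approach above is still the more general one.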