COALESCE on an array of integers in Hive

Time: 2018-02-28 21:50:07

Tags: hive hiveql

foo_ids is an array of bigint, but the whole array can be null. If the array is null, I want an empty array instead.

If I do this: COALESCE(foo_ids, ARRAY())

I get: FAILED: SemanticException [Error 10016]: Line 13:45 Argument type mismatch 'ARRAY': The expressions after COALESCE should all have the same type: "array<bigint>" is expected but "array<string>" is found

If I do this: COALESCE(foo_ids, ARRAY<BIGINT>())

I get a syntax error: FAILED: ParseException line 13:59 cannot recognize input near ')' ')' 'AS' in expression specification

What is the correct syntax here?

1 answer:

Answer 0: (score: 3)

Use this:

coalesce(foo_ids,array(cast(null as bigint)))

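Previously, Hive displayed an empty array [] as []. On Hadoop 2, however, Hive now displays an empty array [] as null (see the reference below). For an empty array of type bigint, use array(cast(null as bigint)); casting the element to bigint gives the array the type array<bigint>, which matches foo_ids and so satisfies COALESCE. Oddly, size() reports -1 (not 0) for such an array, as the NULL row in the sample below shows. Hope this helps.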

Sample data:
foo_ids 
[112345677899098765,1123456778990987633]        
[null,null]     
NULL    

select foo_ids, size(foo_ids) as sz from tbl;
Result:
foo_ids                                        sz
[112345677899098765,1123456778990987633]        2
[null,null]                                     2
NULL                                           -1
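
If downstream logic expects 0 rather than -1 for a missing array, one option is to test for NULL explicitly before calling size(). This is only a sketch: it reuses the tbl and foo_ids names from the question and has not been run against the data above.

```sql
-- Sketch only: tbl and foo_ids are the names used in the question.
-- size() yields -1 for a NULL array (see the result above),
-- so map the NULL case to 0 explicitly.
SELECT foo_ids,
       IF(foo_ids IS NULL, 0, size(foo_ids)) AS sz
FROM tbl;
```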

select foo_ids, coalesce(foo_ids, array(cast(null as bigint))) as newfoo from tbl;
Result:
foo_ids                                         newfoo
[112345677899098765,1123456778990987633]        [112345677899098765,1123456778990987633]
[null,null]                                     [null,null] 
NULL                                            NULL

Reference: https://docs.treasuredata.com/articles/hive-change-201602