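I'm writing a Hive UDF based on the built-in split() GenericUDF. My function split_int() takes two string arguments, split_int(str, regex), for example split_int("1,2,3", ","). I want it to return array<int>, but I get array<string> instead.
Code: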
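import java.util.ArrayList;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class SplitIntUDFSplit extends GenericUDF {
    private ObjectInspectorConverters.Converter[] converters;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 2) {
            throw new UDFArgumentLengthException(
                "The function SPLIT_INT(s, regexp) takes exactly 2 arguments.");
        }
        // Convert both arguments to writable strings (Text) before use.
        converters = new ObjectInspectorConverters.Converter[arguments.length];
        for (int i = 0; i < arguments.length; i++) {
            converters[i] = ObjectInspectorConverters.getConverter(arguments[i],
                PrimitiveObjectInspectorFactory.writableStringObjectInspector);
        }
        // Declared return type: a list whose elements are IntWritable.
        return ObjectInspectorFactory.getStandardListObjectInspector(
            PrimitiveObjectInspectorFactory.writableIntObjectInspector);
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        assert (arguments.length == 2);
        if (arguments[0].get() == null || arguments[1].get() == null) {
            return null;
        }
        Text s = (Text) converters[0].convert(arguments[0].get());
        Text regex = (Text) converters[1].convert(arguments[1].get());
        // Split on the regex and parse each piece as an int.
        ArrayList<IntWritable> result = new ArrayList<IntWritable>();
        for (String str : s.toString().split(regex.toString())) {
            result.add(new IntWritable(Integer.parseInt(str)));
        }
        return result;
    }

    // Required by GenericUDF; not shown in the original snippet.
    @Override
    public String getDisplayString(String[] children) {
        return "split_int(" + children[0] + ", " + children[1] + ")";
    }
}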
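Stripped of the Hive types, the parsing step in evaluate() amounts to the minimal sketch below. The class and method names (SplitIntSketch, splitInt) are hypothetical and not part of the UDF. The sketch uses plain Java collections in place of Text and IntWritable, so it only illustrates the element values expected for the example input, not the Hive return type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the UDF: mirrors the loop in evaluate()
// using plain Java types instead of Text / IntWritable.
class SplitIntSketch {

    // Splits s on the given regex and parses every piece as an int.
    // Throws NumberFormatException if a piece is not a valid integer.
    static List<Integer> splitInt(String s, String regex) {
        List<Integer> result = new ArrayList<>();
        for (String piece : s.split(regex)) {
            result.add(Integer.parseInt(piece));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitInt("1,2,3", ",")); // prints [1, 2, 3]
    }
}
```

Note that Integer.parseInt does not trim whitespace, so an input such as "1, 2" split on "," would throw a NumberFormatException in this sketch.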
How can I make the UDF return array<int>?