Hive UDF - Java String ClassCastException

Posted: 2015-11-01 12:05:18

Tags: java hadoop hive udf

I wrote a UDF that decodes a cookie and returns a List of Strings. Unfortunately, I hit a Hive Runtime Error during processing.

Here is my code:

@Override
public ObjectInspector initialize(ObjectInspector[] input) throws UDFArgumentException {

    ObjectInspector cookieContent = input[0];
    if (!isStringOI(cookieContent)) {
        throw new UDFArgumentException("only string");
    }
    this.cookieValue = (StringObjectInspector) cookieContent;
    // Declare the return type: a list whose elements are Java Strings.
    return ObjectInspectorFactory.getStandardListObjectInspector(
            PrimitiveObjectInspectorFactory.javaStringObjectInspector);
}


@Override
public Object evaluate(DeferredObject[] input) throws HiveException {

    String encoded = cookieValue.getPrimitiveJavaObject(input[0].get());
    try {
        result = decode(encoded);
    } catch (CodeException e) {
        throw new UDFArgumentException();
    }

    return result;
}
public List<String> decode(String encoded) throws CodeException {

    byte[] decodedBase64 = Base64.decodeBase64(encoded);
    String decompressedArray = new String(getKadrs(decodedBase64));
    // Drop the header before the first '|', then split the rest on commas.
    String kadr = decompressedArray.substring(decompressedArray.indexOf("|") + 1);
    List<String> kadrsList = new ArrayList<>(Arrays.asList(kadr.split(",")));
    return kadrsList;
}

private byte[] getKadrs(byte[] compressed) throws CodeException {
    Inflater decompressor = new Inflater();
    decompressor.setInput(compressed);
    ByteArrayOutputStream outPutStream = new ByteArrayOutputStream(compressed.length);
    byte[] temp = new byte[1024];
    while (!decompressor.finished()) {
        try {
            int count = decompressor.inflate(temp);
            outPutStream.write(temp, 0, count);
        } catch (DataFormatException e) {
            throw new CodeException("Wrong data format", e);
        }
    }
    try {
        outPutStream.close();
    } catch (IOException e) {
        throw new CodeException("Can't close outPutStream", e);
    }
    return outPutStream.toByteArray();
}

The result is, let's say, "kadr1,kadr20,kadr35,kadr12". Unit tests work fine, but when I try to use this function in Hive I get this:

    Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.hadoop.io.Text
        at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:41)

It's hard for me to debug because someone else has to deploy my jar to see the results, so any advice would be appreciated.
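For context, the kind of unit test that passes looks roughly like this (a sketch: CookieDecoderUDF and encodedCookie are stand-in names, not from the original code):

    @Test
    public void decodeReturnsKadrList() throws CodeException {
        CookieDecoderUDF udf = new CookieDecoderUDF();
        // encodedCookie stands in for a real Base64 + DEFLATE cookie value.
        List<String> kadrs = udf.decode(encodedCookie);
        assertEquals(Arrays.asList("kadr1", "kadr20", "kadr35", "kadr12"), kadrs);
    }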

2 answers:

Answer 0 (score: 0):

Your evaluate method currently returns a String, which is not a Hadoop data type. You should wrap the string in a Text object by saying return new Text(result).
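A minimal sketch of that change, keeping the writable-backed list inspector (the rest of the method is taken from the question; the error message text is illustrative):

    @Override
    public Object evaluate(DeferredObject[] input) throws HiveException {
        String encoded = cookieValue.getPrimitiveJavaObject(input[0].get());
        List<Text> result = new ArrayList<>();
        try {
            // Wrap each decoded String in org.apache.hadoop.io.Text so it
            // matches writableStringObjectInspector's expected element type.
            for (String kadr : decode(encoded)) {
                result.add(new Text(kadr));
            }
        } catch (CodeException e) {
            throw new UDFArgumentException("cannot decode cookie");
        }
        return result;
    }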

Answer 1 (score: 0):

ravindra is right.

I had initialized it with:

    return ObjectInspectorFactory.getStandardListObjectInspector(
            PrimitiveObjectInspectorFactory.writableStringObjectInspector);

and WritableStringObjectInspector returns Text.

I changed it to javaStringObjectInspector, which returns String, and everything works fine. Thanks!
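In short, the element inspector declared in initialize has to match the runtime type of the objects that evaluate puts into the list. A sketch of the two consistent pairings, based on the code above:

    // Pairing A (used here): Java-backed inspector, evaluate returns List<String>.
    return ObjectInspectorFactory.getStandardListObjectInspector(
            PrimitiveObjectInspectorFactory.javaStringObjectInspector);

    // Pairing B: writable-backed inspector, evaluate must return List<Text>,
    // wrapping every element with new Text(s).
    // return ObjectInspectorFactory.getStandardListObjectInspector(
    //         PrimitiveObjectInspectorFactory.writableStringObjectInspector);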