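I'm running Hive 0.11 on Amazon EMR and trying to create a simple UDF with the GenericUDF class. All the UDF is supposed to do is take a value from a column and print it back to the screen. The point is to see whether I can get this working before building something more complex.
I compile the jar, load it into Hive, and create a temporary function.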
add jar ..../GenericTest.jar;
create temporary function gen_test as 'GenericTest';
When I call the function with the wrong number of arguments, I get the expected error:
SemanticException [Error 10015]: Line 1:13 Arguments length mismatch 'gen_test': Wrong # of Args
However, when I pass the correct number of arguments, it fails immediately with the message:
FAILED: RuntimeException typeInfo cannot be null!
So far I have been unable to find the root of this problem. The code for the UDF is below.
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;

public class GenericTest extends GenericUDF {
    private GenericUDFUtils.ReturnObjectInspectorResolver returnOIResolver;
    private ObjectInspector[] argumentOIs;

    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        argumentOIs = arguments;
        if (arguments.length != 1) {
            throw new UDFArgumentLengthException("Wrong # of Args");
        }
        if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE)
            throw new UDFArgumentTypeException(0, "Only primitive type arguments are accepted");
        returnOIResolver = new GenericUDFUtils.ReturnObjectInspectorResolver(true);
        return returnOIResolver.get();
    }

    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object retVal = returnOIResolver.convertIfNecessary(arguments[0].get(), argumentOIs[0]);
        return retVal;
    }

    public String getDisplayString(String[] children) {
        String rt = "get Display String test";
        return rt;
    }
}
Answer 0 (score: 1)
If you want to try something basic, you can use this:
package yarn;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;

public class GenericUDFNvl extends GenericUDF {
    private GenericUDFUtils.ReturnObjectInspectorResolver returnOIResolver;
    private ObjectInspector[] argumentOIs;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments)
            throws UDFArgumentException {
        argumentOIs = arguments;
        if (arguments.length != 2) {
            throw new UDFArgumentLengthException(
                    "The operator 'NVL' accepts 2 arguments.");
        }
        returnOIResolver = new GenericUDFUtils.ReturnObjectInspectorResolver(true);
        // update() registers each argument's type with the resolver and
        // returns false when the two types cannot be reconciled.
        if (!(returnOIResolver.update(arguments[0])
                && returnOIResolver.update(arguments[1]))) {
            // Argument indexes are 0-based, so 1 refers to the second argument.
            throw new UDFArgumentTypeException(1,
                    "The 1st and 2nd args of function NVL should have the same type, "
                    + "but they are different: \"" + arguments[0].getTypeName()
                    + "\" and \"" + arguments[1].getTypeName() + "\"");
        }
        // The resolver now holds the ObjectInspector of the return value.
        return returnOIResolver.get();
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        // Return the first argument, or the second one if the first is null.
        Object retVal = returnOIResolver.convertIfNecessary(arguments[0].get(),
                argumentOIs[0]);
        if (retVal == null) {
            retVal = returnOIResolver.convertIfNecessary(arguments[1].get(),
                    argumentOIs[1]);
        }
        return retVal;
    }

    @Override
    public String getDisplayString(String[] children) {
        StringBuilder sb = new StringBuilder();
        sb.append("if ");
        sb.append(children[0]);
        sb.append(" is null ");
        sb.append("returns ");
        sb.append(children[1]);
        return sb.toString();
    }
}
You must pass two arguments. If the first argument is not null, the function returns the first argument; if the first argument is null, it returns the second.
select nvl(movie_title,"test") from u_item_test1;
If movie_title has a value, that value is printed; otherwise "test" is printed.
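The null-fallback behaviour described above can be sketched in plain Java, without any Hive classes. The class name NvlSketch and the sample strings are made up for illustration; this only mirrors the semantics, not how Hive evaluates the UDF.

```java
public class NvlSketch {
    /** Returns first unless it is null, in which case fallback is returned. */
    public static <T> T nvl(T first, T fallback) {
        return first != null ? first : fallback;
    }

    public static void main(String[] args) {
        // Hypothetical column values: one present, one missing.
        String present = nvl("some title", "test");
        String missing = nvl(null, "test");
        System.out.println(present); // prints "some title"
        System.out.println(missing); // prints "test"
    }
}
```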
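Answer 1 (score: 0)
I got it working.
In initialize(), I needed a call like returnOIResolver.update(arguments[0]); (as shown in the first answer), so that return returnOIResolver.get(); actually returns something (the ObjectInspector of the return value) instead of null.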
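To illustrate why the missing update() call matters, here is a toy model in plain Java. ToyResolver, initializeWithoutUpdate and initializeWithUpdate are hypothetical names invented for this sketch. It is not Hive's ReturnObjectInspectorResolver; it only assumes, as this answer suggests, that get() has nothing to return until update() has seen an argument.

```java
import java.util.Objects;

public class ResolverSketch {
    /** Hypothetical, simplified resolver: remembers the first type name it is shown. */
    static final class ToyResolver {
        private String returnType; // stays null until update() is called

        /** Records the type; returns false if it conflicts with an earlier one. */
        boolean update(String typeName) {
            Objects.requireNonNull(typeName, "typeName");
            if (returnType == null) {
                returnType = typeName;
                return true;
            }
            return returnType.equals(typeName);
        }

        /** Returns the recorded type, or null if update() was never called. */
        String get() {
            return returnType;
        }
    }

    /** Mirrors the question's initialize(): get() is called without update(). */
    public static String initializeWithoutUpdate(String argType) {
        ToyResolver resolver = new ToyResolver();
        return resolver.get();
    }

    /** Mirrors the fix: the argument type is registered before get(). */
    public static String initializeWithUpdate(String argType) {
        ToyResolver resolver = new ToyResolver();
        resolver.update(argType);
        return resolver.get();
    }

    public static void main(String[] args) {
        System.out.println(initializeWithoutUpdate("string")); // prints "null"
        System.out.println(initializeWithUpdate("string"));    // prints "string"
    }
}
```

In the real UDF, the null returned from initialize() is presumably what surfaces as "typeInfo cannot be null!" when Hive tries to derive the return type.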