Updating the jar of a Hive UDF

Time: 2018-04-13 09:28:58

Tags: hive hive-udf

TL;DR: How do I update the jar of a custom UDF in Hive?

I wrote my own (generic) UDF, and it works very well. I can define a new function and use it with this command:

CREATE FUNCTION myfunc AS 'io.company.hive.udf.myfunc' USING JAR 'hdfs:///myudfs/myfunc.jar';

Now I want to update my UDF, so I put an updated version of the jar, under the same name, in HDFS. After that, this is what happens:

  • The first call to the function fails with java.io.IOException: Previous writer likely failed to write hdfs://ip-10-0-10-xxx.eu-west-1.compute.internal:8020/tmp/hive/hive/_tez_session_dir/0de6055d-190d-41ee-9acb-c6b402969940/myfunc.jar Failing because I am unlikely to write too.
  • The second call fails with org.apache.hadoop.hive.ql.metadata.HiveException: Default queue should always be returned.Hence we should not be here.

The log file shows:

Localizing resource because it does not exist: file:/tmp/8f45f1b7-2850-4fdc-b07e-0b53b3ddf5de_resources/myfunc.jar to dest: hdfs://ip-10-0-10-129.eu-west-1.compute.internal:8020/tmp/hive/hive/_tez_session_dir/994ad52c-4b38-4ee2-92e9-67076afbbf10/myfunc.jar
tez.DagUtils (DagUtils.java:localizeResource(961)) - Looks like another thread is writing the same file will wait.
tez.DagUtils (DagUtils.java:localizeResource(968)) - Number of wait attempts: 5. Wait interval: 5000
tez.DagUtils (DagUtils.java:localizeResource(984)) - Could not find the jar that was being uploaded

I have already tried:

  • Adding the jar to hive.reloadable.aux.jars.path and hive.aux.jar.path
  • Different combinations of list jar / delete jar / create function / reload, all to no avail.

I even ended up with a query that apparently started and then just hung: no progress, nothing in the logs, and no DAG created.

INFO  : converting to local hdfs:///hive-udf-wp/hive-udf-wp.jar
INFO  : Added [/tmp/19e0c9fc-9c7c-4de5-a034-ced062f87f64_resources/hive-udf-wp.jar] to class path
INFO  : Added resources: [hdfs:///hive-udf-wp/hive-udf-wp.jar]

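I think that asking Tez not to reuse the current session might solve the problem, because a new session would then be created that does not use the old version of the jar. Would that be an option?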

1 answer:

Answer 0: (score: 0)

The only way I know of to deal with this is to restart Hive.
(I am still looking for a good way to update a UDF.)
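
One possible workaround, offered only as a sketch: the errors in the question come from a jar being replaced under the same file name, so publishing each new build under a different file name and re-registering the function may avoid the clash in the Tez session directory. The file name myfunc-v2.jar below is a hypothetical example and does not appear in the question. I have not verified this against the setup described, and a session that has already loaded the old class may still need to reconnect before it sees the new one.

```sql
-- Sketch only: myfunc-v2.jar is a hypothetical file name for the new build,
-- uploaded to HDFS next to the old jar instead of overwriting it.

-- Remove the existing permanent function definition from the metastore.
DROP FUNCTION IF EXISTS myfunc;

-- Register the function again, pointing at the newly named jar.
CREATE FUNCTION myfunc AS 'io.company.hive.udf.myfunc'
  USING JAR 'hdfs:///myudfs/myfunc-v2.jar';
```

The design choice here is to treat jars as immutable artifacts: the function name stays the same for callers, and only the jar path in the function definition changes with each release.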