java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:486)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
... 8 more
The HQL script is as follows:
SELECT
TRANSFORM (userid, movieid, rating)
USING 'python /home/daxingyu930/test_data_mapper2.py'
AS userid, movieid, rating
;
The Python script is very simple; it splits each line on \t.
I have already tested the Python script on Linux with the following shell command:
cat test_data/u_data.txt | python test_data_mapper2.py
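The script itself is not shown in the question. A minimal sketch of what a tab-splitting mapper of this kind might look like is given here, assuming three input columns (userid, movieid, rating) that are passed through unchanged; the function names are hypothetical and not taken from the original test_data_mapper2.py.

```python
import sys


def transform_line(line):
    """Split one tab-separated input row and return it as a tab-separated output row."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: the first three columns are userid, movieid, rating.
    # A row with fewer than three fields raises ValueError here.
    userid, movieid, rating = fields[:3]
    return "\t".join([userid, movieid, rating])


def main(stdin=None, stdout=None):
    """Read rows from stdin and write the transformed rows to stdout."""
    if stdin is None:
        stdin = sys.stdin
    if stdout is None:
        stdout = sys.stdout
    for line in stdin:
        if not line.strip():
            continue  # skip blank lines instead of failing on them
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

An uncaught exception in such a script makes the Python process exit with a non-zero status, so it is worth checking the exit code (echo $?) after the cat test as well as the printed output.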
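Please give me some ideas about this problem. It is driving me crazy and keeping me awake. Thanks a lot.
Answer 0 (score: 5)
Before using a custom script, you should add it to the distributed cache.
For example: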
add file /home/daxingyu930/test_data_mapper2.py;
SELECT
TRANSFORM (userid, movieid, rating)
USING 'python test_data_mapper2.py'
AS userid, movieid, rating
;
Answer 1 (score: 1)
First make the script executable: chmod +x test_data_mapper2.py
Then run the following command from the Hive CLI:
add file /home/daxingyu930/test_data_mapper2.py;
Answer 2 (score: 0)
You should not give the full path to the script in the USING clause. Use only the file name of the Python script (.py).
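The reasoning behind using the bare file name appears to be that the task launches the script from its own working directory, where a file registered with add file is expected to be available. A rough way to imitate that locally is sketched here: run the script by name from a chosen directory, pipe tab-separated rows to its stdin, and inspect the exit code. run_mapper is a hypothetical helper for illustration, not part of Hive, and it only approximates what a map task does.

```python
import subprocess
import sys


def run_mapper(script_name, rows, cwd=None):
    """Pipe tab-separated rows to the script and return (exit_code, stdout, stderr)."""
    payload = "".join("\t".join(row) + "\n" for row in rows)
    proc = subprocess.run(
        [sys.executable, script_name],  # script is resolved relative to cwd
        input=payload,
        capture_output=True,  # requires Python 3.7 or later
        text=True,
        cwd=cwd,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

For example, run_mapper("test_data_mapper2.py", [["1", "2", "3"]], cwd="/home/daxingyu930") should return exit code 0 if the script is present in that directory and handles the row; a missing file or an uncaught exception yields a non-zero exit code and a message in stderr.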