Using Python

Time: 2016-09-11 13:20:06

Tags: python hive

I wrote a simple Hive UDF in Python, but when I run it in the Hive shell it throws this error:

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:514)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
        ... 8 more


FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.


Commands used in the Hive shell:

add file /path/to/mycode.py;

Created the table;

Loaded the data;

SELECT TRANSFORM (fname,lname) USING 'python mycode.py' AS (fname,LNAME) from table;

Hadoop version: 2.7

Hive: 0.13

mycode.py

#!/usr/bin/python
import sys
import string

try:
    for line in sys.stdin:
        lines = string.strip(line,'\n')
        fname,lname = string.split(lines,',')
        #print (fname,lname)
        LNAME = lname.lower()
        #print LNAME
        print([fname,LNAME])
except:
    print sys.exc_info()

Error:

(<type 'exceptions.ValueError'>, ValueError('need more than 1 value to unpack',), <traceback object at 0x7f3441c14050>) NULL

However, when I run cat pyinp.txt | python mycode.py, it produces the desired output.

Can someone help me solve this problem?

1 answer:

Answer 0: (score: 2)

This resolved the problem in other cases like yours: here

They used a try/except block in the function; did you do the same?
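
One likely cause, given the ValueError in the question: by default Hive's TRANSFORM sends each row to the script as tab-separated text and expects tab-separated columns back, so splitting on ',' yields a single field and the two-name unpack fails. That would also explain why the local cat test works, assuming pyinp.txt is comma-separated. Below is a minimal sketch of a tab-aware version of the script, not a confirmed fix. The helper names transform_line and run are hypothetical, and the choice to skip malformed rows is an assumption for illustration.

```python
import sys


def transform_line(line):
    """Lower-case the second of two tab-separated fields.

    Returns the output row without a trailing newline, or None when the
    row does not have exactly two fields.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        return None
    fname, lname = fields
    # Join with a tab so Hive can split the result back into two columns.
    return "\t".join([fname, lname.lower()])


def run(in_stream, out_stream, err_stream=sys.stderr):
    """Transform every row from in_stream and write the results to out_stream."""
    for line in in_stream:
        result = transform_line(line)
        if result is None:
            # Report bad rows on stderr so they do not pollute the output columns.
            err_stream.write("skipping malformed row: %r\n" % line)
        else:
            out_stream.write(result + "\n")
```

To use this as the TRANSFORM script, end the file with a call to run(sys.stdin, sys.stdout). To test it locally in the same format Hive would send, pipe a tab-separated file through it rather than a comma-separated one.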