Hive UDF written in Python - runtime error

Asked: 2014-06-25 18:10:59

Tags: python hadoop hive user-defined-functions

I wrote a Hive query that calls a UDF written in Python, but it fails with this error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"psdr_doc_id":790892632}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"psdr_doc_id":790892632}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    ... 8 more
Caused by: or

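I don't think the problem is in my Python function, because it works when run outside Hive. Can anyone tell from the error alone what might be going wrong? Let me know if more information is needed.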

My query is:

hive -e "select TRANSFORM(psdr_doc_id) using 'get_date.py' as published_date from test_table"

get_date.py takes an article ID (psdr_doc_id) as input and prints that article's publication date.
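For reference, here is a minimal sketch of how a script used with TRANSFORM is usually laid out. Hive streams each row to the script on stdin as one line with tab-separated columns, and reads one line per row back from stdout. The actual get_date.py is not shown in the question, so `lookup_published_date` is a hypothetical placeholder and the error handling is only an assumption about what such a script might want to do.

```python
#!/usr/bin/env python
import sys


def lookup_published_date(doc_id):
    """Hypothetical placeholder: the real lookup from the question is not shown."""
    return "unknown"


def transform(lines, out):
    """Write one output line for every input row received from Hive."""
    for line in lines:
        # Hive sends each row as a single line; columns are separated by tabs.
        fields = line.rstrip("\n").split("\t")
        doc_id = fields[0]
        try:
            published_date = lookup_published_date(doc_id)
        except Exception as exc:
            # Report the failure on stderr (it ends up in the task log) and
            # emit \N, which Hive's default serialization reads as NULL.
            sys.stderr.write("lookup failed for %r: %s\n" % (doc_id, exc))
            published_date = "\\N"
        out.write("%s\n" % published_date)


if __name__ == "__main__":
    transform(sys.stdin, sys.stdout)
```

A script like this can be checked locally with something like `echo 790892632 | python get_date.py`. Inside Hive, the script also has to be available on the worker nodes (typically registered with `ADD FILE`) and runnable there, for example by invoking it as `using 'python get_date.py'`. Whether either point is the cause of the error above cannot be determined from the truncated stack trace.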

Any help with this would be greatly appreciated. Thanks!

0 answers:

No answers yet.