I wrote a Hive query that calls a UDF written in Python, but I am getting this error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"psdr_doc_id":790892632}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"psdr_doc_id":790892632}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: or
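I don't think it has anything to do with my Python function, since it works outside of Hive. Just from looking at the error, can anyone tell me what might be going wrong? Let me know if more information is needed.
My query is: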
hive -e "select TRANSFORM(psdr_doc_id) using 'get_date.py' as published_date from test_table"
get_date.py prints the published date of an article, taking the article's ID (psdr_doc_id) as input.
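For context, here is a minimal sketch of the general shape such a TRANSFORM script takes. It is not the actual get_date.py. Hive streams each row to the script on stdin as one tab-separated line (with NULL sent as the literal string \N) and reads result rows back from stdout. The function lookup_published_date is a hypothetical placeholder for the real date lookup, and its return value is made up.

```python
#!/usr/bin/env python
import sys


def lookup_published_date(doc_id):
    # Hypothetical placeholder: the real script would look up the
    # published date for doc_id here. The value below is made up.
    return "1970-01-01"


def transform(lines, lookup=lookup_published_date):
    """Yield one output row per input row, as Hive TRANSFORM expects."""
    for line in lines:
        # Each input row is a tab-separated line; the ID is the first column.
        doc_id = line.rstrip("\n").split("\t")[0]
        # Hive passes NULL as the literal string \N; pass it through unchanged.
        if not doc_id or doc_id == "\\N":
            yield "\\N"
            continue
        yield lookup(doc_id)


if __name__ == "__main__":
    for row in transform(sys.stdin):
        sys.stdout.write(row + "\n")
```

A script shaped like this can be checked outside Hive by piping an ID into it, e.g. echo 790892632 | python get_date.py, which is how I confirmed that mine works on its own.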
Any help with this is greatly appreciated, thanks!