I have a simple Python script:
#!/usr/local/bin/python
import sys
import datetime
for line in sys.stdin:
line = line.strip()
fname , lname = line.split('\t')
l_name = lname.lower()
print '\t'.join([fname, str(l_name)])
The Hive table data looks like this:
Akash Gupta
Ashish Agarwal
Aarav Kedia
Rajesh Lakhia
Sunita Patel
Raj Dutta
Nadeem Siddiqui
The table structure is:
hive> desc fullName;
OK
fname string
lname string
I added my Python script with:
add FILE /full-path-to-the-script/convertToLowerCase.py;
Now I am running the TRANSFORM operation with the script:
SELECT TRANSFORM(fname, lname) USING 'python convertToLowerCase.py' AS (fname, l_name) FROM fullName;
However, the MapReduce job throws an error:
FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.
What am I doing wrong?
Answer 0 (score: 0)
The problem was in the Python code: the indentation of the for loop was wrong. Fixing the indentation resolved the issue.
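For reference, here is a minimal sketch of how the script might look with the loop body indented. The helper names `to_lower_last_name` and `main` are my own and not part of the original script. I also use `sys.stdout.write` in place of the Python 2 `print` statement so that the sketch runs under Python 2 or 3, and I assume every input row has exactly two tab-separated columns.

```python
#!/usr/bin/env python
import sys


def to_lower_last_name(line):
    """Return 'fname<TAB>lname' with the last name lower-cased."""
    # Hive's TRANSFORM passes each row to the script as one tab-separated line.
    fname, lname = line.strip().split('\t')
    return '\t'.join([fname, lname.lower()])


def main(stream=sys.stdin):
    # Everything that handles a row must be indented under the for loop.
    for line in stream:
        sys.stdout.write(to_lower_last_name(line) + '\n')


if __name__ == '__main__':
    main()
```

It is probably worth running the script locally before handing it to Hive, for example by piping a tab-separated line into `python convertToLowerCase.py`. A Python syntax or indentation error shows up immediately that way, whereas inside the MapReduce job it is hidden behind the generic return code 20003.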