Map and reduce fail when run with the hadoop streaming command

Time: 2016-12-12 23:44:51

Tags: python hadoop

My Python mapper and reducer code runs fine when I test it locally with a simple piped command:

hadoop fs -cat /user/root/myinput/testfile3_node.csv | ./mapper_1.py | sort | ./reducer_1.py

When I run the same code with the hadoop streaming command, it fails:

hadoop jar /usr/iop/current/hadoop-mapreduce-client/hadoop-streaming.jar -mapper ./mapper_1.py -reducer ./reducer_1.py -file ./mapper_1.py -file ./reducer_1.py -input /user/root/myinput/testfile3.csv -output /user/root/myoutput/indexing_output1

Output:

Screenshot of simple command_running. Screenshot of Hadoop streaming jar command.

1 answer:

Answer 0 (score: 0)

Try dropping the ./ from the -mapper and -reducer arguments (make sure you are in the correct directory); you also don't need to use -files:

hadoop jar /usr/iop/current/hadoop-mapreduce-client/hadoop-streaming.jar \
    -mapper mapper_1.py \
    -reducer reducer_1.py \
    -input /user/root/myinput/testfile3.csv \
    -output /user/root/myoutput/indexing_output1

Here is the Apache Hadoop documentation:

https://hadoop.apache.org/docs/r1.2.1/streaming.html
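
For reference, the local test in the question (cat | mapper | sort | reducer) exercises the same contract that Hadoop streaming relies on: the mapper writes key<TAB>value lines, the lines are sorted by key, and the reducer reads them in sorted order. Below is a minimal sketch of that flow in a single Python process. The actual mapper_1.py and reducer_1.py are not shown in the question, so the counting logic and the names mapper, reducer, and run_local are hypothetical stand-ins, not the asker's code.

```python
#!/usr/bin/env python
"""Local dry run of a streaming-style job: map -> sort -> reduce.

The counting logic is hypothetical; the real mapper_1.py and
reducer_1.py are not shown in the question.
"""
from itertools import groupby


def mapper(lines):
    """Emit one 'key<TAB>1' record per non-empty comma-separated field."""
    for line in lines:
        for field in line.strip().split(","):
            if field:
                yield "%s\t1" % field


def reducer(sorted_records):
    """Sum the counts of consecutive records that share a key."""
    keyed = (record.split("\t", 1) for record in sorted_records)
    for key, group in groupby(keyed, key=lambda kv: kv[0]):
        yield "%s\t%d" % (key, sum(int(value) for _, value in group))


def run_local(lines):
    """Mimic `cat input | mapper | sort | reducer` in one process."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = ["a,b,a", "b,c"]  # made-up CSV rows, not the asker's data
    for record in run_local(sample):
        print(record)
```

A dry run like this only confirms the logic on the input it is given. Note that the local test in the question reads testfile3_node.csv while the streaming job reads testfile3.csv, so it may be worth repeating the local test against the exact file the job uses.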
