Question

My Python mapper and reducer work correctly when I run them through a plain shell pipeline:

hadoop fs -cat /user/root/myinput/testfile3_node.csv | ./mapper_1.py | sort | ./reducer_1.py

When I run the same code with the hadoop streaming command, it fails:

hadoop jar /usr/iop/current/hadoop-mapreduce-client/hadoop-streaming.jar -mapper ./mapper_1.py -reducer ./reducer_1.py -file ./mapper_1.py -file ./reducer_1.py -input /user/root/myinput/testfile3.csv -output /user/root/myoutput/indexing_output1

Output:

Screenshot of simple command_running. Screenshot of Hadoop streaming jar command.

Answer 1

Try dropping the ./ from the -mapper and -reducer arguments (make sure you are in the correct directory); there is also no need to use -files:

hadoop jar /usr/iop/current/hadoop-mapreduce-client/hadoop-streaming.jar \
    -mapper mapper_1.py \
    -reducer reducer_1.py \
    -input /user/root/myinput/testfile3.csv \
    -output /user/root/myoutput/indexing_output1

See the Apache Hadoop documentation:

https://hadoop.apache.org/docs/r1.2.1/streaming.html
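
For context on why the local pipe test is a reasonable approximation: Hadoop Streaming feeds input lines to the mapper on stdin, reads lines back from its stdout (by default everything up to the first tab is the key), and hands the reducer those records sorted by key. The real mapper_1.py and reducer_1.py are not shown in the question, so the following is only a hypothetical stand-in that counts rows per first CSV column. The function names map_lines and reduce_sorted and the "map"/"reduce" argument are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for mapper_1.py / reducer_1.py (the real scripts are not shown)."""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'key<TAB>1' record per non-empty CSV line, keyed on the first column."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key = line.split(",")[0]
        yield f"{key}\t1"


def reduce_sorted(records):
    """Sum the counts of consecutive records that share a key.

    The input must already be sorted by key, as it is after `sort`
    locally or after the shuffle phase on the cluster.
    """
    parsed = (rec.rstrip("\n").split("\t", 1) for rec in records)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"


if __name__ == "__main__":
    # Assumed convention for this sketch: the first argument selects the step.
    steps = {"map": map_lines, "reduce": reduce_sorted}
    step = steps.get(sys.argv[1]) if len(sys.argv) > 1 else None
    if step is not None:
        for out in step(sys.stdin):
            print(out)
```

Saved under a hypothetical name such as sketch.py, it could be checked locally the same way as in the question: pipe the file through `python3 sketch.py map`, then `sort`, then `python3 sketch.py reduce`. The shebang on the first line matters when a script is started by name rather than through an explicit interpreter.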

Map and reduce fail when run with the hadoop streaming command
