Question

I am running a graph traversal algorithm with MapReduce, and it produces the expected output when I test it without Hadoop. But when I run this command:

hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file /home/hduser/finalmap.py -mapper 'python finalmap.py' -file /home/hduser/finalred.py -reducer 'python finalred.py' -input /Random_Walk_Input -output Random_Walk_Output1

the following happens:

16/01/27 11:03:51 INFO mapreduce.Job: map 0% reduce 0%
16/01/27 11:03:55 INFO mapreduce.Job: map 33% reduce 0%
16/01/27 11:04:02 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_1, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/01/27 11:04:03 INFO mapreduce.Job: map 50% reduce 0%

16/01/27 11:04:14 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_2, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/01/27 11:04:22 INFO mapreduce.Job: map 50% reduce 17%

16/01/27 11:04:25 INFO mapreduce.Job: map 100% reduce 100%

16/01/27 11:04:26 INFO mapreduce.Job: Job job_1453872707553_0001 failed with state FAILED due to: Task failed task_1453872707553_0001_m_000001

Job failed as tasks failed. failedMaps:1 failedReduces:0

16/01/27 11:04:27 INFO mapreduce.Job: Counters: 39
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=15725173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=413787
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=3
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Failed map tasks=4
        Killed reduce tasks=1
        Launched map tasks=5
        Launched reduce tasks=1
        Other local map tasks=3
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=68482
        Total time spent by all reduces in occupied slots (ms)=19382
        Total time spent by all map tasks (ms)=68482
        Total time spent by all reduce tasks (ms)=19382
        Total vcore-seconds taken by all map tasks=68482
        Total vcore-seconds taken by all reduce tasks=19382
        Total megabyte-seconds taken by all map tasks=70125568
        Total megabyte-seconds taken by all reduce tasks=19847168
    Map-Reduce Framework
        Map input records=17666
        Map output records=767145
        Map output bytes=14081829
        Map output materialized bytes=15616125
        Input split bytes=91
        Combine input records=0
        Spilled Records=767145
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=229
        CPU time spent (ms)=17120
        Physical memory (bytes) snapshot=269684736
        Virtual memory (bytes) snapshot=852369408
        Total committed heap usage (bytes)=200802304
    File Input Format Counters
        Bytes Read=413696
16/01/27 11:04:27 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!

What does this mean? The log shows the mapper and reducer reaching 100%, yet it then reports failedMaps:1 and failedReduces:0.
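
For reference, "subprocess failed with code 1" in the trace above means the streaming script's process exited with status 1. A rough way to look for such an exit outside Hadoop is to pipe sample input through the mapper, sort the result, and pipe it through the reducer. The sketch below is only an approximation under stated assumptions: the helper names `run_stage` and `simulate_streaming` are made up for illustration, sorting whole lines stands in for Hadoop's sort by key, and a script that passes locally can still fail on the cluster.

```python
import subprocess
import sys


def run_stage(command, input_text):
    """Run one streaming stage locally, feeding input_text on stdin.

    Returns (exit_code, stdout, stderr). Hypothetical helper, not a Hadoop API.
    """
    result = subprocess.run(
        command, input=input_text, capture_output=True, text=True
    )
    return result.returncode, result.stdout, result.stderr


def simulate_streaming(mapper_cmd, reducer_cmd, input_text):
    """Approximate mapper | sort | reducer and report which stage failed.

    Returns (stage, exit_code, text) where stage is "mapper", "reducer" or "ok".
    On failure, text is the stage's stderr; on success, the reducer's stdout.
    """
    code, mapped, err = run_stage(mapper_cmd, input_text)
    if code != 0:
        return "mapper", code, err
    # Simplification: sort whole lines instead of sorting by key only.
    shuffled = "".join(sorted(mapped.splitlines(keepends=True)))
    code, reduced, err = run_stage(reducer_cmd, shuffled)
    if code != 0:
        return "reducer", code, err
    return "ok", 0, reduced
```

With the scripts from the command above, a call might look like `simulate_streaming(["python", "finalmap.py"], ["python", "finalred.py"], sample_text)`, where `sample_text` is a small slice of the real input.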

Answer 1

Make sure the version of your streaming jar matches your Hadoop version (they should have the same version number). This fixed the error for me!
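
A minimal sketch of that check, assuming the jar follows the usual `hadoop-streaming-<version>.jar` naming and that `hadoop version` prints a first line such as `Hadoop 2.6.0`. The function names are hypothetical helpers, not part of any Hadoop tooling.

```python
import re
import subprocess


def jar_version(jar_path):
    """Extract the version from a name like hadoop-streaming-2.6.0.jar."""
    match = re.search(r"hadoop-streaming-(.+)\.jar$", jar_path)
    return match.group(1) if match else None


def hadoop_version(version_output):
    """Extract the version from the text printed by `hadoop version`."""
    match = re.match(r"Hadoop (\S+)", version_output.strip())
    return match.group(1) if match else None


def versions_match(jar_path, version_output):
    """True only if both versions were found and are identical."""
    jv = jar_version(jar_path)
    hv = hadoop_version(version_output)
    return jv is not None and jv == hv


def read_hadoop_version_output():
    """Run `hadoop version` and return its stdout (needs hadoop on PATH)."""
    result = subprocess.run(
        ["hadoop", "version"], capture_output=True, text=True
    )
    return result.stdout
```

For the command in the question, `versions_match("/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar", read_hadoop_version_output())` would be expected to return True only on a Hadoop 2.6.0 installation.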
