Hadoop Shuffle Failure

Time: 2018-09-13 06:27:30

Tags: mapreduce hdfs hadoop2

I am running a word count program:

hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount /Small_shakespeare_punctuators_removed.raw /output.txt

After the map phase completes, the shuffle stage fails with the following error:

ERROR:-
18/09/13 11:23:54 INFO mapreduce.Job:  map 100% reduce 0%
18/09/13 11:23:58 INFO mapreduce.Job: Task Id : attempt_1536767235788_0003_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#5
        at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
        at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
        at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

After checking the mapred logs, I found the following:

2018-09-13 11:25:25,975 INFO org.apache.hadoop.mapreduce.jobhistory.JobSummary: jobId=job_1536767235788_0003,submitTime=1536818023943,launchTime=1536818028069,firstMapTaskLaunchTime=1536818030260,firstReduceTaskLaunchTime=1536818035174,finishTime=1536818052423,resourcesPerMap=2048,resourcesPerReduce=4096,numMaps=1,numReduces=1,user=hadoop,queue=default,status=FAILED,mapSlotSeconds=10,reduceSlotSeconds=36,jobName=word count

2018-09-13 11:25:25,980 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving hdfs://mycluster:8020/tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1536767235788_0003-1536818023943-hadoop-word+count-1536818052423-1-0-FAILED-default-1536818028069.jhist to hdfs://mycluster:8020/tmp/hadoop-yarn/staging/history/done/2018/09/13/000000/job_1536767235788_0003-1536818023943-hadoop-word+count-1536818052423-1-0-FAILED-default-1536818028069.jhist

From this log, I can tell that the file is failing to move from one folder to another within HDFS, but I am still stuck.
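
(To check whether the .jhist file actually moved, the two directories from the log above can be listed; a sketch using the same paths:)

hdfs dfs -ls /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/
hdfs dfs -ls /tmp/hadoop-yarn/staging/history/done/2018/09/13/000000/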

Even after setting the following properties:

  1. mapreduce.reduce.shuffle.input.buffer.percent = 0.7
  2. mapreduce.reduce.shuffle.parallelcopies = 4

the error persists.
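
(For reference, a sketch of passing these two properties as per-job overrides via the -D generic option on the same wordcount command; the equivalent entries can instead be placed in mapred-site.xml:)

hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount \
  -D mapreduce.reduce.shuffle.input.buffer.percent=0.7 \
  -D mapreduce.reduce.shuffle.parallelcopies=4 \
  /Small_shakespeare_punctuators_removed.raw /output.txt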

EDIT:-

I was able to resolve the issue. It was caused by the following property:

mapred.map.output.compression.codec -> org.apache.hadoop.io.compress.GzipCodec

This property was interfering with the MapReduce shuffle, because it prevented the file from being moved from the done_intermediate folder to done.
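
(As a sketch of the workaround, assuming the codec had been set cluster-wide in mapred-site.xml: remove that property, or disable map-output compression for the job. Note that mapred.map.output.compression.codec is the deprecated Hadoop 2.x name of mapreduce.map.output.compress.codec.)

hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount \
  -D mapreduce.map.output.compress=false \
  /Small_shakespeare_punctuators_removed.raw /output.txt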

0 Answers:

No answers