Map works fine, but Reduce fails

Time: 2012-06-15 02:38:57

Tags: hadoop mapreduce

I am running a simple sort program, but I get the following errors.

12/06/15 01:13:17 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_000001_1&filter=stdout
12/06/15 01:13:18 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_000001_1&filter=stderr
12/06/15 01:13:20 INFO mapred.JobClient:  map 50% reduce 0%
12/06/15 01:13:23 INFO mapred.JobClient:  map 100% reduce 0%
12/06/15 01:14:19 INFO mapred.JobClient: Task Id : attempt_201206150102_0002_m_000000_2, Status : FAILED
Too many fetch-failures
12/06/15 01:14:20 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_000000_2&filter=stdout

Does anyone know the cause and how to fix it?

-------Update: more log information-------------------

2012-06-15 19:56:07,039 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-06-15 19:56:07,258 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-06-15 19:56:07,339 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : null
2012-06-15 19:56:07,346 INFO org.apache.hadoop.mapred.ReduceTask: ShuffleRamManager: MemoryLimit=144965632, MaxSingleShuffleLimit=36241408
2012-06-15 19:56:07,351 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Thread started: Thread for merging on-disk files
2012-06-15 19:56:07,351 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Thread started: Thread for merging in memory files
2012-06-15 19:56:07,351 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Thread waiting: Thread for merging on-disk files
2012-06-15 19:56:07,352 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Need another 2 map output(s) where 0 is already in progress
2012-06-15 19:56:07,352 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Thread started: Thread for polling Map Completion Events
2012-06-15 19:56:07,352 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and 0 dup hosts)
2012-06-15 19:56:12,353 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Scheduled 1 outputs (0 slow hosts and 0 dup hosts)
2012-06-15 19:56:32,076 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 copy failed: attempt_201206151954_0001_m_000000_0 from 192.168.1.106
2012-06-15 19:56:32,077 WARN org.apache.hadoop.mapred.ReduceTask: java.io.IOException: Server returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/mapOutput?job=job_201206151954_0001&map=attempt_201206151954_0001_m_000000_0&reduce=0
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1639)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1575)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1483)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1394)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)

2012-06-15 19:56:32,077 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201206151954_0001_r_000000_0: Failed fetch #1 from attempt_201206151954_0001_m_000000_0
2012-06-15 19:56:32,077 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201206151954_0001_m_000000_0 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker
2012-06-15 19:56:32,077 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 adding host 192.168.1.106 to penalty box, next contact in 12 seconds
2012-06-15 19:56:32,077 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0: Got 1 map-outputs from previous failures
2012-06-15 19:56:47,080 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Scheduled 1 outputs (0 slow hosts and 0 dup hosts)
2012-06-15 19:56:56,048 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 copy failed: attempt_201206151954_0001_m_000000_0 from 192.168.1.106
2012-06-15 19:56:56,049 WARN org.apache.hadoop.mapred.ReduceTask: java.io.IOException: Server returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/mapOutput?job=job_201206151954_0001&map=attempt_201206151954_0001_m_000000_0&reduce=0
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1639)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1575)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1483)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1394)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)

2012-06-15 19:56:56,049 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201206151954_0001_r_000000_0: Failed fetch #2 from attempt_201206151954_0001_m_000000_0
2012-06-15 19:56:56,049 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201206151954_0001_m_000000_0 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker
2012-06-15 19:56:56,049 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 adding host 192.168.1.106 to penalty box, next contact in 16 seconds
2012-06-15 19:56:56,049 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0: Got 1 map-outputs from previous failures
2012-06-15 19:57:11,053 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Need another 2 map output(s) where 0 is already in progress
2012-06-15 19:57:11,053 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Scheduled 0 outputs (1 slow hosts and 0 dup hosts)
2012-06-15 19:57:11,053 INFO org.apache.hadoop.mapred.ReduceTask: Penalized(slow) Hosts:
2012-06-15 19:57:11,053 INFO org.apache.hadoop.mapred.ReduceTask: 192.168.1.106 Will be considered after: 1 seconds.
2012-06-15 19:57:16,055 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 Scheduled 1 outputs (0 slow hosts and 0 dup hosts)
2012-06-15 19:57:25,984 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201206151954_0001_r_000000_0 copy failed: attempt_201206151954_0001_m_000000_0 from 192.168.1.106
2012-06-15 19:57:25,984 WARN org.apache.hadoop.mapred.ReduceTask: java.io.IOException: Server returned HTTP response code: 403 for URL: _http://192.168.1.106:50060/mapOutput?job=job_201206151954_0001&map=attempt_201206151954_0001_m_000000_0&reduce=0
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1639)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1575)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1483)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1394)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)

Best regards,

1 answer:

Answer 0 (score: 4)

I had the same problem. After digging deeper, I traced it to host name resolution. Check the log of the specific attempt in

$HADOOP_HOME/logs/userlogs/JobXXX/attemptXXX/syslog

If it contains something like

WARN org.apache.hadoop.mapred.ReduceTask: java.net.UnknownHostException: slave-1.local.lan

then just add the corresponding entries to /etc/hosts. After I did this, the error was resolved and everything worked on the next attempt.
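
If there are many attempt directories, the check described above can be scripted. The following is only a rough Python sketch, not part of Hadoop: the function names are made up for illustration, and it assumes the attempt logs are files named syslog somewhere under the userlogs directory, as in the path given above. It collects the host names reported in UnknownHostException lines and tests whether the current machine can resolve them.

```python
import re
import socket
from pathlib import Path

# Captures the host name that follows "java.net.UnknownHostException:" in a log line.
UNKNOWN_HOST = re.compile(r"java\.net\.UnknownHostException:\s*([A-Za-z0-9._-]+)")


def find_unknown_hosts(log_text):
    """Return the sorted, de-duplicated host names reported as unknown."""
    # rstrip(".") drops a trailing full stop that the pattern may have captured.
    return sorted({name.rstrip(".") for name in UNKNOWN_HOST.findall(log_text)})


def scan_userlogs(userlogs_dir):
    """Collect unknown hosts from every file named 'syslog' under userlogs_dir."""
    hosts = set()
    for path in Path(userlogs_dir).rglob("syslog"):
        hosts.update(find_unknown_hosts(path.read_text(errors="replace")))
    return sorted(hosts)


def unresolved(hosts, resolve=socket.gethostbyname):
    """Return the hosts that `resolve` cannot turn into an address on this machine."""
    missing = []
    for host in hosts:
        try:
            resolve(host)
        except (OSError, UnicodeError):  # socket.gaierror is a subclass of OSError
            missing.append(host)
    return missing
```

Calling unresolved(scan_userlogs(...)) with the userlogs directory on a node lists the names that node still cannot resolve; those are the candidates for new /etc/hosts entries. Running it again after editing /etc/hosts should return an empty list if the entries are correct.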