Hadoop job fails while merging files in the reducer

Time: 2013-06-21 03:45:12

Tags: java hadoop

During the merge/copy phase of the reduce step, I get the following error:

java.io.IOException: Task: attempt_201306130308_0177_r_000002_0 - The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:384)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Intermediate merge failed
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2703)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2628)
Caused by: java.io.IOException: Rec# 13932: Failed to skip past record of length: 129
    at org.apache.hadoop.mapred.IFile$InMemoryReader.next(IFile.java:542)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2687)
    ... 1 more

Looking at the code in IFile.java at that location, I see this:

long skipped = dataIn.skip(recordLength);
if (skipped != recordLength) {
    throw new IOException("Rec# " + recNo + ": Failed to skip past record of length: " +
                          recordLength);
}

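So basically, the in-memory read of the next record is somehow inconsistent: the reader could not skip the full length of the record. How is this possible? Is it because the buffer does not fit completely in memory, and do I need to increase the memory?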
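To make the check above concrete, here is a minimal standalone sketch of how skip() can return less than what was requested. It does not use Hadoop at all: the class SkipDemo and the helper skipRecord are hypothetical names, and it assumes the in-memory reader behaves like a DataInputStream over a byte array, which I have not verified against the Hadoop source. Under that assumption, a short skip means the buffer holds fewer remaining bytes than the record length claims.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class SkipDemo {
    // Hypothetical helper: tries to skip one record of the given length
    // and returns how many bytes were actually skipped.
    static long skipRecord(byte[] segment, long recordLength) throws IOException {
        try (DataInputStream dataIn =
                 new DataInputStream(new ByteArrayInputStream(segment))) {
            // skip() is allowed to skip fewer bytes than requested;
            // for a byte-array stream it stops at the end of the array.
            return dataIn.skip(recordLength);
        }
    }

    public static void main(String[] args) throws IOException {
        long recordLength = 129;            // length taken from the error message
        byte[] shortSegment = new byte[100]; // assumed: buffer shorter than the record

        long skipped = skipRecord(shortSegment, recordLength);
        if (skipped != recordLength) {
            // Same condition that triggers the IOException in the snippet above.
            System.out.println("Short skip: asked for " + recordLength
                    + ", got " + skipped);
        }
    }
}
```

If this model matches what the reader does, the error would point to a mismatch between the recorded record length and the bytes actually present in the in-memory segment, rather than to the skip call itself.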

Any suggestions on what to try next?

0 answers:

No answers yet