Question

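I have a reduce job that fails with an error saying the file could only be replicated to 0 nodes instead of 1. I searched online and found that this can point to a datanode problem, but the other MapReduce jobs I run in this workflow all work fine. The only difference I can see is that I am using MultipleOutputs and specifying a folder, but I am sure the path is correct. This is the MultipleOutputs write line: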

mos.write("mosName", new LongWritable(key), value, outputFilePath);

The exact error I get is:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File xxx could 
only be replicated to 0 nodes instead of minReplication (=1).  There are 7 
datanode(s) running and no node(s) are excluded in this operation.

Any help would be appreciated.

Answer 1

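I had the same problem. It did not reproduce when I wrote my output to the context instead of to MultipleOutputs. As far as I understand, this happens because MultipleOutputs keeps data in memory for longer.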

The solution was a combination of:

(1) Compressing the outputs

FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

(2) Giving my job more memory (keep in mind that the JVM memory set in java.opts must be at most 80% of the container memory)

-Dmapreduce.map.memory.mb=3072 -Dmapreduce.map.java.opts=-Xmx2048m
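
The 80% guideline above can be checked with a little arithmetic before submitting the job. The following is only a sketch: the class name ContainerMemory and its methods are hypothetical helpers, not part of Hadoop, and the 4/5 ratio is taken directly from the rule of thumb in this answer rather than from any Hadoop setting.

```java
// Hypothetical helper (not a Hadoop class) that applies the answer's 80% rule of thumb.
final class ContainerMemory {

    /** Largest heap size in MB that stays within 80% of the container (integer math, rounds down). */
    public static int maxHeapMb(int containerMb) {
        return containerMb * 4 / 5;
    }

    /** True if the requested heap respects the 80% guideline. */
    public static boolean heapFits(int containerMb, int heapMb) {
        return heapMb <= maxHeapMb(containerMb);
    }

    /** Builds the -D options in the same form as the command line shown in the answer. */
    public static String mapOptions(int containerMb, int heapMb) {
        if (!heapFits(containerMb, heapMb)) {
            throw new IllegalArgumentException(
                "heap " + heapMb + " MB exceeds 80% of container " + containerMb + " MB");
        }
        return "-Dmapreduce.map.memory.mb=" + containerMb
            + " -Dmapreduce.map.java.opts=-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        // Values from the answer: 3072 MB container, 2048 MB heap.
        System.out.println(mapOptions(3072, 2048));
    }
}
```

With the values from the answer, the 2048 MB heap is well under the limit for a 3072 MB container. Since the question is about a reduce job, the same check would presumably apply to mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts.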

MapReduce MultipleOutputs: file could only be replicated to 0 nodes instead of 1