Question

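I am getting a "Java heap space" error when I run a MapReduce program with an entire folder as the input to the MR job. When I give a single file as the input, I do not get any error and the job completes successfully.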

Changes I tried in hadoop-env.sh file:
=====================================
I increased the memory size from 1024 MB to 2048 MB:
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"

Changes in mapred-site.xml:
===========================
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Even after making these changes to both files, I still get the "Java heap space" error.

Can anyone advise me on this issue?

Answer 1

You can turn on HPROF profiling for the job with the following:

conf.setBoolean("mapred.task.profile", true);
conf.set("mapred.task.profile.params", "-agentlib:hprof=cpu=samples,"
    + "heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
conf.set("mapred.task.profile.maps", "0-2");
conf.set("mapred.task.profile.reduces", "0-2");

This will help you diagnose what is exhausting the heap. See pages 178-181 of "Hadoop: The Definitive Guide" for more details.
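Alongside profiling, it may be worth confirming that the -Xmx value set in mapred.child.java.opts actually reaches the task JVM. The following is only a minimal sketch using the standard java.lang.Runtime API. The class name HeapReport and the idea of calling it from a mapper's setup method are assumptions made for illustration; they are not part of Hadoop.

```java
// Hypothetical helper (not a Hadoop class): reports the heap limits of the
// JVM it runs in, using only java.lang.Runtime.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    private HeapReport() {}

    /** Maximum heap the JVM will try to use, in MB (this reflects -Xmx). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently occupied by objects, in MB. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    /** One-line summary suitable for a task log. */
    static String describe() {
        return "heap: max=" + maxHeapMb() + "MB used=" + usedHeapMb() + "MB";
    }

    public static void main(String[] args) {
        // Run standalone with, e.g., java -Xmx2048m HeapReport
        System.out.println(describe());
    }
}
```

If HeapReport.describe() is printed from inside a map task and the reported maximum is well below 2048 MB, the setting is probably not being picked up by the task JVM. If the task reports the expected maximum, the error may instead be coming from the client JVM that submits the job.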
