Question

I am using the distributed shell application (hadoop-2.0.0-cdh4.1.2). This is the error I am currently getting.

13/01/01 17:09:09 INFO distributedshell.Client: Got application report from ASM for, appId=5, clientToken=null, appDiagnostics=Application application_1357039792045_0005 failed 1 times due to AM Container for appattempt_1357039792045_0005_000001 exited with  exitCode: 143 due to: Container [pid=24845,containerID=container_1357039792045_0005_01_000001] is running beyond virtual memory limits. Current usage: 77.8mb of 512.0mb physical memory used; 1.1gb of 1.0gb virtual memory used. Killing container.
Dump of the process-tree for container_1357039792045_0005_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 24849 24845 24845 24845 (java) 165 12 1048494080 19590 /usr/java/bin/java -Xmx512m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 128 --num_containers 1 --priority 0 --shell_command ping --shell_args localhost --debug
|- 24845 23394 24845 24845 (bash) 0 0 108654592 315 /bin/bash -c /usr/java/bin/java -Xmx512m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 128 --num_containers 1 --priority 0 --shell_command ping --shell_args localhost --debug 1>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stdout 2>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stderr

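Interestingly, the setup itself seems fine, because simple ls or uname commands complete successfully and their output is available in the container2 stdout.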

Regarding the setup, yarn.nodenamager.vmem-pmem-ratio is 3 and the total available physical memory is 2GB, which I think should be enough for this to run.

For the command in question, "ping localhost" produced two replies, as can be seen from containerlogs/container_1357039792045_0005_01_000002/721917/stdout/?start=-4096.

So, what could be the problem?

Answer 1

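From the error message, you can see that the container is using more virtual memory than its current limit of 1.0gb. This can be resolved in two ways:

Disable the virtual memory limit check

YARN will then ignore the limit entirely. To do this, add the following to yarn-site.xml: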

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for containers.</description>
</property>

The default value of this setting is true.

Increase the virtual-to-physical memory ratio

In yarn-site.xml, change this to a value higher than the one currently set:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
  <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
</property>

The default is 2.1.
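To see why the container in the question was killed, the limit can be worked out by hand. The sketch below assumes, following the property description above, that the virtual memory limit is the container's physical allocation multiplied by the ratio, and that usage is summed over the process tree. The helper name `vmem_limit_bytes` is illustrative and not part of any Hadoop API; the byte counts are the VMEM_USAGE values from the process-tree dump in the question.

```python
MIB = 1024 * 1024
GIB = 1024 ** 3


def vmem_limit_bytes(container_mb: int, vmem_pmem_ratio: float) -> int:
    """Virtual memory limit, assuming limit = physical allocation * ratio."""
    return int(container_mb * MIB * vmem_pmem_ratio)


# VMEM_USAGE(BYTES) of the java and bash processes in the dump.
observed_vmem = 1048494080 + 108654592

# Compare the observed footprint against the limit for a 512 MB container
# at the default ratio and at two higher ratios.
for ratio in (2.1, 3.0, 5.0):
    limit = vmem_limit_bytes(512, ratio)
    verdict = "over limit" if observed_vmem > limit else "within limit"
    print(f"ratio={ratio}: limit={limit / GIB:.2f} GiB, "
          f"used={observed_vmem / GIB:.2f} GiB -> {verdict}")
```

Under these assumptions, a 512 MB container at the default ratio of 2.1 gets a limit of about 1.05 GiB, which is consistent with the 1.0gb limit in the log, while the observed footprint is about 1.08 GiB. A ratio of 3 would give a 1.5 GiB limit, which suggests the ratio of 3 mentioned in the question may not have been in effect.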

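You can also increase the amount of physical memory allocated to the container.

Do not forget to restart YARN after changing the configuration.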

Answer 2

There is no need to change the cluster configuration. I found that simply passing the extra parameter

-Dmapreduce.map.memory.mb=4096

to distcp helped me.

Answer 3

If you are running the Tez framework, you must set the following parameters in tez-site.xml:

tez.am.resource.memory.mb
tez.task.resource.memory.mb
tez.am.java.opts

And in yarn-site.xml:

yarn.nodemanager.resource.memory-mb
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.vmem-check-enabled
yarn.nodemanager.vmem-pmem-ratio

All of these parameters must be set.

Answer 4

In yarn-site.xml, you can change this to a value higher than the default of 1 GB:

yarn.app.mapreduce.am.resource.mb

Answer 5

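In practice, I have seen this issue occur when running queries against large tables, tables that contain a large number of files or many small files, tables that are not bucketed, or when querying a large number of partitions.

The problem arises when Tez tries to calculate how many mappers it needs to spawn. While doing this calculation it tends to run out of memory (OOM), because the default value (1gb) is too small.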

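The way to address the issue is to set tez.am.resource.memory.mb to 2gb or 4gb. Another very important point is that this cannot be set from within the Hive query, because by then it is too late. The AM is the first container spawned by YARN, so setting it in the Hive query has no effect.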

It needs to be set in *-site.xml, or when launching the Hive shell, as follows:

tez.am.resource.memory.mb

In the example above, I am signalling that the AM should be spawned with 2gb instead of the default.

Reference: http://moi.vonos.net/bigdata/hive-cli-memory/

Answer 6

OK, found it. Increase the master memory parameter to more than 750MB and the YARN application will run successfully.
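As a rough sanity check on the 750MB figure, the relationship can be inverted to estimate the smallest physical allocation whose virtual limit would cover the footprint seen in the question's process-tree dump. This is only a back-of-the-envelope sketch: it assumes limit = allocation * ratio with the default ratio of 2.1, and it assumes the virtual footprint stays at the value from the dump, which may not hold if the JVM heap setting grows along with the container. The helper name `min_container_mb` is hypothetical.

```python
import math

MIB = 1024 * 1024


def min_container_mb(vmem_bytes: int, vmem_pmem_ratio: float) -> int:
    """Smallest allocation (MB) whose virtual limit covers vmem_bytes,
    assuming limit = allocation * ratio."""
    return math.ceil(vmem_bytes / (vmem_pmem_ratio * MIB))


# VMEM_USAGE(BYTES) of the java and bash processes in the dump.
observed_vmem = 1048494080 + 108654592

needed_mb = min_container_mb(observed_vmem, 2.1)
print(f"needs at least {needed_mb} MB; 750 MB is enough: {750 >= needed_mb}")
```

With these numbers the estimate comes out at a little over 512 MB, so a 750MB master leaves some headroom under the default ratio.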

AM Container is running beyond virtual memory limits
