Apache Ignite on YARN not working - logs for node containers are not present

Time: 2017-09-25 17:00:26

Tags: yarn hadoop2 ignite

We have been trying to get Apache Ignite (2.1.0) running on YARN (on HDP 2.6) by following this guide: https://apacheignite.readme.io/docs/yarn-deployment (we are running in an offline environment).

Although we got the application to start, the stderr log of the YARN AM container is the only log we can see (stdout is empty). It looks like this:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.4.0-3485/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/disk/09/hadoop/yarn/local/usercache/hongmei/appcache/application_1464374946035_32224/filecache/10/ignite-yarn.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/06/10 08:43:48 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/06/10 08:43:48 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Jun 10, 2016 8:43:48 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Application master registered.
Jun 10, 2016 8:43:48 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 10, 2016 8:43:48 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
16/06/10 08:43:49 INFO impl.AMRMClientImpl: Received new token for : c5hdp105.c5.runwaynine.com:45454
16/06/10 08:43:49 INFO impl.AMRMClientImpl: Received new token for : c5hdp112.c5.runwaynine.com:45454
Jun 10, 2016 8:43:49 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 10, 2016 8:43:49 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 2,432, cpu 1.
Jun 10, 2016 8:43:49 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_32224_01_000002.
16/06/10 08:43:49 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp105.c5.runwaynine.com:45454
Jun 10, 2016 8:43:49 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_e24_1464374946035_32224_01_000003.
16/06/10 08:43:49 INFO impl.ContainerManagementProtocolProxy: Opening proxy : c5hdp112.c5.runwaynine.com:45454
16/06/10 08:43:50 INFO impl.AMRMClientImpl: Received new token for : c5hdp114.c5.runwaynine.com:45454
16/06/10 08:43:50 INFO impl.AMRMClientImpl: Received new token for : c5hdp115.c5.runwaynine.com:45454
Jun 10, 2016 8:43:50 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_32224_01_000006. State: COMPLETE.
Jun 10, 2016 8:43:50 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_32224_01_000007. State: COMPLETE.
Jun 10, 2016 8:43:50 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_32224_01_000004. State: COMPLETE.
Jun 10, 2016 8:43:50 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
INFO: Container completed. Container id: container_e24_1464374946035_32224_01_000005. State: COMPLETE.
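
To make an AM log like this easier to read, here is a minimal Python sketch that collects the container IDs from the "Launching container" and "Container completed" lines. The function name summarize_am_log is hypothetical, and the regular expressions assume the exact message wording shown in the excerpt. Run against the excerpt, it would report containers ending in 000002 and 000003 as launched and those ending in 000004 through 000007 as completed, with no overlap between the two sets. That may be worth checking against the full log.

```python
import re

# Patterns assume the message wording seen in the AM stderr excerpt.
LAUNCH_RE = re.compile(r"Launching container: (container_\w+)")
DONE_RE = re.compile(r"Container completed\. Container id: (container_\w+)\.")


def summarize_am_log(lines):
    """Return (launched, completed) sets of container IDs found in the lines."""
    launched, completed = set(), set()
    for line in lines:
        match = LAUNCH_RE.search(line)
        if match:
            launched.add(match.group(1))
            continue
        match = DONE_RE.search(line)
        if match:
            completed.add(match.group(1))
    return launched, completed
```

A possible use is to pass an open file handle for the saved stderr, then print sorted(completed - launched) to list the containers that completed without ever appearing in a launch message.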

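The log above goes on like this indefinitely (> 1 day). What basically seems to happen is that YARN keeps starting and killing containers so that their number always matches the IGNITE_NODE_COUNT we provided in the Ignite conf file. I can also confirm this by looking at rm-audit.log, which has TONS of operations like the following: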

..
..
OPERATION=AM Allocated Container .... RESULT=SUCCESS <containerid1>
OPERATION=AM Allocated Container .... RESULT=SUCCESS <containerid2>
OPERATION=AM Allocated Container .... RESULT=SUCCESS <containerid3>
OPERATION=AM Released Container .... RESULT=SUCCESS <containerid1>
OPERATION=AM Released Container .... RESULT=SUCCESS <containerid2>
OPERATION=AM Released Container .... RESULT=SUCCESS <containerid3>
..
..
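
The allocate/release churn in an audit log like the one above can be quantified with a short Python sketch. It assumes each audit line contains the literal text OPERATION=AM Allocated Container or OPERATION=AM Released Container, as in the excerpt. The remaining fields are elided there, so the sketch ignores them. The function name count_container_ops is hypothetical.

```python
import re
from collections import Counter

# Only the OPERATION field is matched; other fields are elided in the excerpt.
OP_RE = re.compile(r"OPERATION=(AM (?:Allocated|Released) Container)\b")


def count_container_ops(lines):
    """Count 'AM Allocated Container' and 'AM Released Container' audit entries."""
    counts = Counter()
    for line in lines:
        match = OP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

If the two counts keep growing together, that would be consistent with containers being handed back almost as fast as they are allocated.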

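When I try to access the logs for each container, YARN does not show them for those containers and instead returns the message Logs for container <container_id> are not present in this log-file. I have tried both the YARN UI and yarn logs -applicationId <application_id> -containerId <containerId>.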

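I tried setting the yarn.log-aggregation-enable property to both true and false, but no logs are generated even on the local filesystem. All the logs I get come from the AM container, which keeps repeating the same messages shown above.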
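
For checking the local filesystem, a small Python sketch along these lines could be run on each NodeManager host. It assumes the usual layout, in which container logs are written to <log-dir>/<application_id>/<container_id>/ under every directory listed in yarn.nodemanager.log-dirs. The function name find_container_logs is hypothetical, and the caller has to supply the actual log directories from yarn-site.xml.

```python
from pathlib import Path


def find_container_logs(log_dirs, application_id, container_id):
    """Return the log files found for one container under the given log dirs.

    Assumes the layout <log_dir>/<application_id>/<container_id>/<file>.
    """
    found = []
    for log_dir in log_dirs:
        container_dir = Path(log_dir) / application_id / container_id
        if container_dir.is_dir():
            # Collect regular files such as stdout and stderr.
            found.extend(sorted(p for p in container_dir.iterdir() if p.is_file()))
    return found
```

An empty result on the host where a container was allocated would suggest that the container never wrote any logs there, although it could also mean the directory was already cleaned up.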

How can I enable logs for the other containers? Without them I cannot tell what is wrong with my Ignite setup. Please help.

cluster.properties file (Ignite)

IGNITE_NODE_COUNT=3
IGNITE_RUN_CPU_PER_NODE=2
IGNITE_MEMORY_PER_NODE=1024
IGNITE_XML_CONFIG=/data/ignite/<xml_name>.xml
IGNITE_WORK_DIR=/data/ignite/work
IGNITE_RELEASES_DIR=/data/ignite/releases
IGNITE_VERSION=2.1.0
IGNITE_PATH=/data/ignite/<path_to_ignite>.zip

0 answers:

No answers