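I have set up a Hadoop 2.7.5 HA cluster and have been running Flink 1.4.0 applications on the default YARN queue. I decided to categorize the applications and run them on dedicated NodeManagers, so I labeled three nodes (4 cores and 2 GB RAM each) as stream for queue streamQ, and nodes with 1 core and 1 GB RAM each as online for queue onlineQ. The YARN web UI shows all settings as intended and the nodes are labeled.
I start the Flink session from an edge node whose Hadoop configuration is identical to the cluster's. Here is capacity-scheduler.xml: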
<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
</property>
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.1</value>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>
<property>
<name>yarn.scheduler.capacity.node-locality-delay</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings</name>
<value></value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>streamQ,onlineQ</value>
</property>
<!-- streamQ settings -->
<property>
<name>yarn.scheduler.capacity.root.streamQ.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_submit_applications</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_administer_queue</name>
<value>*</value>
</property>
<!-- onlineQ settings -->
<property>
<name>yarn.scheduler.capacity.root.onlineQ.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_submit_applications</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_administer_queue</name>
<value>*</value>
</property>
I launch the session with this command:
yarn-session.sh -n 2 -jm 768 -tm 768 -nm flink -z flink_zoo -s 3 -qu streamQ
It uploads the Flink libraries to HDFS successfully and the application appears in the YARN web UI, but when it tries to acquire resources, the log says:
018-01-28 10:02:04,087 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
What is the problem?
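As a rough sanity check on whether the request could fit at all, the arithmetic can be sketched as below. It assumes the default yarn.scheduler.minimum-allocation-mb of 1024 and that each stream node offers its full 2 GB to YARN; neither value is stated in the question. Only memory is counted, since the configuration uses DefaultResourceCalculator.

```python
import math


def normalize_mb(requested_mb, min_alloc_mb=1024):
    """Round a container request up to a multiple of the scheduler minimum.

    min_alloc_mb=1024 is the YARN default; the cluster's real value is an assumption.
    """
    return max(min_alloc_mb, math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb)


# From the command: one JobManager container (-jm 768) plus -n 2 TaskManagers (-tm 768).
requests_mb = [768] + [768] * 2
granted_mb = [normalize_mb(r) for r in requests_mb]
total_needed_mb = sum(granted_mb)

# From the question: three nodes labeled stream with 2 GB RAM each.
# Assumption: every NodeManager exposes the full 2048 MB to YARN.
stream_partition_mb = 3 * 2048

print(granted_mb)                              # [1024, 1024, 1024]
print(total_needed_mb)                         # 3072
print(total_needed_mb <= stream_partition_mb)  # True
```

Under these assumptions the three containers need about 3 GB of the 6 GB in the stream partition, which suggests the timeout comes from the queue configuration rather than from a real shortage of memory.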
Answer 0 (score: 0)
Editing capacity-scheduler.xml as follows solved the problem:
<!-- configuration of queue-root -->
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>streamQ,onlineQ</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.default-node-label-expression</name>
<value>*</value>
</property>
<!-- configuration of queue-streamQ -->
<property>
<name>yarn.scheduler.capacity.root.streamQ.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name>
<value>stream</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.online.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name>
<value>stream</value>
</property>
<!-- configuration of queue-onlineQ -->
<property>
<name>yarn.scheduler.capacity.root.onlineQ.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name>
<value>online</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.stream.capacity</name>
<value>0</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name>
<value>online</value>
</property>
</configuration>
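One plausible reading of why this configuration works: as I understand the CapacityScheduler node-label documentation, a per-label capacity that is not set defaults to 0, so the root queue has to be given 100 for each label before its children can share that partition, and the children's capacities for a label should add up to 100. The sketch below checks just those two rules against a reduced copy of the question's properties and of the properties above. The helper names (load_properties, check_label, to_xml) are hypothetical and not part of Hadoop.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity."


def load_properties(xml_text):
    """Return {name: value} for every <property> inside <configuration>."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): (p.findtext("value") or "") for p in root.iter("property")}


def label_capacity(props, queue_path, label):
    key = f"{PREFIX}{queue_path}.accessible-node-labels.{label}.capacity"
    return float(props.get(key, 0))  # assumption: an unset label capacity counts as 0


def check_label(props, label):
    """Hypothetical check: list the rule violations found for one node label."""
    problems = []
    if label_capacity(props, "root", label) != 100:
        problems.append(f"root has no 100% capacity for label {label}")
    children = [q.strip() for q in props.get(PREFIX + "root.queues", "").split(",") if q.strip()]
    total = sum(label_capacity(props, f"root.{q}", label) for q in children)
    if total != 100:
        problems.append(f"child capacities for label {label} sum to {total:g}, not 100")
    return problems


def to_xml(pairs):
    """Build a minimal capacity-scheduler.xml document from name/value pairs."""
    body = "".join(
        f"<property><name>{PREFIX}{n}</name><value>{v}</value></property>"
        for n, v in pairs.items()
    )
    return f"<configuration>{body}</configuration>"


# Label-related properties only, copied from the question and from this answer.
question = to_xml({
    "root.queues": "streamQ,onlineQ",
    "root.streamQ.accessible-node-labels.stream.capacity": 100,
    "root.onlineQ.accessible-node-labels.online.capacity": 100,
})
answer = to_xml({
    "root.queues": "streamQ,onlineQ",
    "root.accessible-node-labels.stream.capacity": 100,
    "root.accessible-node-labels.online.capacity": 100,
    "root.streamQ.accessible-node-labels.stream.capacity": 100,
    "root.streamQ.accessible-node-labels.online.capacity": 0,
    "root.onlineQ.accessible-node-labels.online.capacity": 100,
    "root.onlineQ.accessible-node-labels.stream.capacity": 0,
})

for name, text in (("question", question), ("answer", answer)):
    props = load_properties(text)
    for label in ("stream", "online"):
        print(name, label, check_label(props, label))
```

The check flags the missing root-level label capacity in the question's file and reports nothing for the file above. It does not cover the other change, raising the default-partition capacity of both queues from 0 to 50, which may also matter for admitting the ApplicationMaster.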
Answer 1 (score: -1)
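Please check your Flink application log to see whether there is a problem connecting to the YARN ResourceManager. I also ran into this problem when using Flink on YARN with HA. I'm not sure whether I am the only one.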