Flink session on labeled YARN: requested resources are not available

Time: 2018-01-28 06:33:56

Tags: hadoop yarn apache-flink

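I've set up a Hadoop 2.7.5 HA cluster and was running Flink 1.4.0 applications in the default YARN queue. I decided to categorize the applications and run them on exclusive NodeManagers, so I labeled three nodes, each with 4 cores and 2 GB of RAM, as stream in queue streamQ, and a node with 1 core and 1 GB of RAM as online in queue onlineQ. The YARN web UI shows all settings as intended and recognizes the labeled nodes. I start the Flink session from an edge node whose Hadoop configuration is identical to the cluster's.

Here is capacity-scheduler.xml: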

<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
</property>

<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.1</value>
</property>

<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>

<property>
<name>yarn.scheduler.capacity.node-locality-delay</name>
<value>40</value>
</property>

<property>
<name>yarn.scheduler.capacity.queue-mappings</name>
<value></value>
</property>

<property>
<name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
<value>false</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>streamQ,onlineQ</value>
</property>

<!-- streamQ settings -->

<property>
<name>yarn.scheduler.capacity.root.streamQ.capacity</name>
<value>0</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name>
<value>stream</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name>
<value>stream</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.user-limit-factor</name>
<value>1</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.state</name>
<value>RUNNING</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_submit_applications</name>
<value>*</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.streamQ.acl_administer_queue</name>
<value>*</value>
</property>

<!-- onlineQ settings -->

<property>
<name>yarn.scheduler.capacity.root.onlineQ.capacity</name>
<value>0</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name>
<value>online</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name>
<value>online</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.user-limit-factor</name>
<value>1</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.state</name>
<value>RUNNING</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_submit_applications</name>
<value>*</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.onlineQ.acl_administer_queue</name>
<value>*</value>
</property>

I started the Flink session with this command:

yarn-session.sh -n 2 -jm 768 -tm 768 -nm flink -z flink_zoo -s 3 -qu streamQ

It successfully uploaded the Flink libraries to HDFS, and I can see the application in the YARN web UI, but when it tries to acquire resources, it logs:

2018-01-28 10:02:04,087 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster

What is the problem?

2 answers:

Answer 0: (score: 0)

Editing capacity-scheduler.xml as follows solved the problem:

<configuration>

<!-- configuration of queue-root -->


<property> 
  <name>yarn.scheduler.capacity.root.queues</name> 
  <value>streamQ,onlineQ</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.accessible-node-labels</name> 
  <value>*</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.accessible-node-labels.stream.capacity</name> 
  <value>100</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.accessible-node-labels.online.capacity</name> 
  <value>100</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.default-node-label-expression</name> 
  <value>*</value> 
</property>


 <!-- configuration of queue-streamQ -->


<property> 
  <name>yarn.scheduler.capacity.root.streamQ.capacity</name> 
  <value>50</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.streamQ.maximum-capacity</name> 
  <value>100</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels</name> 
  <value>stream</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name> 
  <value>100</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.online.capacity</name> 
  <value>0</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.streamQ.default-node-label-expression</name> 
  <value>stream</value> 
</property>


<!-- configuration of queue-onlineQ -->


<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.capacity</name> 
  <value>50</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.maximum-capacity</name> 
  <value>100</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels</name> 
  <value>online</value> 
</property>

<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name> 
  <value>100</value>
</property>

<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.stream.capacity</name> 
  <value>0</value>
</property>

<property> 
  <name>yarn.scheduler.capacity.root.onlineQ.default-node-label-expression</name> 
  <value>online</value> 
</property>

</configuration>
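The edit above adds per-label capacities on root, which the original file did not have. As far as I understand the Capacity Scheduler, a label capacity that is not set counts as 0, and the label capacities of the direct children of a parent should add up to 100, so a quick check of those two rules may help when comparing the two files. The following is only a sketch under that assumption: the helper names (`load_properties`, `label_capacity`, `check_label`) and the two trimmed-down sample configs are mine, not part of Hadoop or Flink.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity."

# Trimmed-down samples (assumption: only the label-related keys matter here).
QUESTION_XML = """<configuration>
<property><name>yarn.scheduler.capacity.root.queues</name><value>streamQ,onlineQ</value></property>
<property><name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name><value>100</value></property>
<property><name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name><value>100</value></property>
</configuration>"""

ANSWER_XML = """<configuration>
<property><name>yarn.scheduler.capacity.root.queues</name><value>streamQ,onlineQ</value></property>
<property><name>yarn.scheduler.capacity.root.accessible-node-labels.stream.capacity</name><value>100</value></property>
<property><name>yarn.scheduler.capacity.root.accessible-node-labels.online.capacity</name><value>100</value></property>
<property><name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.stream.capacity</name><value>100</value></property>
<property><name>yarn.scheduler.capacity.root.streamQ.accessible-node-labels.online.capacity</name><value>0</value></property>
<property><name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.online.capacity</name><value>100</value></property>
<property><name>yarn.scheduler.capacity.root.onlineQ.accessible-node-labels.stream.capacity</name><value>0</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            props[name] = (prop.findtext("value") or "").strip()
    return props


def label_capacity(props, queue_path, label):
    """Label capacity of a queue; an unset key is treated as 0 (assumption)."""
    key = f"{PREFIX}{queue_path}.accessible-node-labels.{label}.capacity"
    return float(props.get(key, 0))


def check_label(props, label):
    """List the problems found for one node label directly under root."""
    problems = []
    if label_capacity(props, "root", label) == 0:
        problems.append(f"root has no capacity for label '{label}'")
    queues = props.get(PREFIX + "root.queues", "")
    children = [q.strip() for q in queues.split(",") if q.strip()]
    total = sum(label_capacity(props, f"root.{q}", label) for q in children)
    if total != 100:
        problems.append(f"children of root sum to {total} for label '{label}'")
    return problems


if __name__ == "__main__":
    for title, text in (("question", QUESTION_XML), ("answer", ANSWER_XML)):
        props = load_properties(text)
        for label in ("stream", "online"):
            print(title, label, check_label(props, label))
```

On the trimmed samples, the check flags the missing root capacity for both labels in the question's file and reports nothing for the edited one. It does not look at the default-partition capacities (0 in the question, 50 in the edit), which may matter as well.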

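Answer 1: (score: -1)

Please check your Flink application logs to see whether there is a problem connecting to the YARN ResourceManager. I also ran into this issue when using Flink on YARN with HA. I'm not sure if I'm the only one.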