Using HA

Time: 2016-07-03 10:14:06

Tags: apache-spark yarn

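I have an HA-enabled YARN cluster with two ResourceManagers. The problem is that Spark always tries to connect to the first ResourceManager, even when it is in standby mode.

The YARN version is 2.6 and the Spark version is 1.4.1.

yarn-site.xml: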

<property>
  <name>yarn.resourcemanager.address</name>
  <value>hadoop-0:8050</value>
</property>

<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>hadoop-0:8141</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
  <value>/yarn-leader-election</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop-0</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>hadoop-0</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>hadoop-4</value>
</property>

Logs:

16/07/03 05:44:56 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:44:57 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:44:58 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:44:59 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:00 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:01 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:02 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:03 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:04 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:05 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:06 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:07 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:08 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:09 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:10 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:10 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/07/03 05:45:11 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 15 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:12 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 16 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:13 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 17 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:14 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 18 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:15 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 19 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:16 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 20 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:17 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 21 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:18 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 22 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:19 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 23 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:20 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 24 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:21 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 25 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:22 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 26 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:23 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 27 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:24 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 28 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:25 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 29 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:25 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/07/03 05:45:26 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 30 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:27 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 31 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:28 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 32 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:29 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 33 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:30 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 34 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:31 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 35 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:32 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 36 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:33 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 37 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:34 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 38 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:35 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 39 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:36 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 40 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:37 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 41 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:38 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 42 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:39 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 43 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
16/07/03 05:45:40 INFO ipc.Client: Retrying connect to server: hadoop-0/10.240.0.15:8030. Already tried 44 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)

1 answer:

Answer 0 (score: 0)

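Because of the YARN client failover mechanism, the client connects to the ResourceManagers (RMs) in round-robin order, so it may try the standby RM before it reaches the active one.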
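
To make the round-robin idea concrete, here is a minimal Python sketch. It is only an illustrative simulation, not Hadoop's actual client code: the function `connect_with_failover`, the `is_active` callback, and the `max_attempts` parameter are all hypothetical names invented for this example. The RM ids are taken from `yarn.resourcemanager.ha.rm-ids` in the question.

```python
from itertools import cycle


def connect_with_failover(rm_ids, is_active, max_attempts=6):
    """Try RMs in round-robin order until one reports itself active.

    rm_ids: ordered RM ids, e.g. the value of yarn.resourcemanager.ha.rm-ids.
    is_active: hypothetical callback standing in for a real connection attempt.
    Returns (active_rm_id or None, list of ids tried in order).
    """
    attempts = []
    for rm_id in cycle(rm_ids):
        if len(attempts) >= max_attempts:
            break  # give up after a bounded number of attempts
        attempts.append(rm_id)
        if is_active(rm_id):
            return rm_id, attempts
    return None, attempts


# Assumption for illustration: rm1 (hadoop-0) is standby, rm2 (hadoop-4) is active.
active, tried = connect_with_failover(["rm1", "rm2"], lambda rm: rm == "rm2")
print(active, tried)  # rm2 ['rm1', 'rm2']
```

In this simulation the first RM in the list is always tried first, which matches the behaviour described in the question. It does not explain why the client in the log keeps retrying hadoop-0 instead of moving on to hadoop-4.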