Stop retrying the connection to the YARN ResourceManager after a limited number of attempts in Java

Time: 2018-09-18 18:35:26

Tags: apache-spark client yarn

I am submitting a Spark application to YARN from Java using the org.apache.spark.deploy.yarn.Client API.

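import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.springframework.util.StringUtils; // assumption: hasText suggests Spring's StringUtils

SparkConf sparkConf = new SparkConf();

// Command-line style arguments handed to the YARN client.
List<String> submitArgs = new ArrayList<>();

// Set the application name only when one was provided.
if (StringUtils.hasText(appName)) {
    submitArgs.add("--name");
    submitArgs.add(appName);
    sparkConf.setAppName(appName);
}
// Application jar and its main class.
submitArgs.add("--jar");
submitArgs.add(appJarPath);

submitArgs.add("--class");
submitArgs.add(appMainClass);
System.setProperty("SPARK_YARN_MODE", "true");

// Run in cluster mode on YARN, in the given queue.
sparkConf.setMaster("yarn")
       .set("spark.submit.deployMode", "cluster")
       .set("spark.yarn.queue", queue);

// Build the client and submit; run() blocks while it tries to reach the ResourceManager.
ClientArguments clientArguments = new ClientArguments(submitArgs.toArray(new String[submitArgs.size()]));
Client client = new Client(clientArguments, sparkConf);
client.run();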

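I am trying to set up a scenario in which the client cannot connect to the ResourceManager, and I expect the connection attempt to time out and throw an exception. Instead, the client keeps trying to connect to the YARN ResourceManager and never times out. See the console output below: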

2018-09-18 14:20:25.046  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.yarn.client.RMProxy    : Connecting to ResourceManager at /0.0.0.0:8032
2018-09-18 14:20:27.600  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:29.603  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:31.600  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:33.607  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:35.608  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:37.619  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:39.620  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:41.623  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:43.633  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:45.632  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:18.665  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:20.676  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:22.679  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:24.686  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:26.691  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:28.695  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:30.697  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:32.708  INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client             : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...
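
One way to read the log above is that two retry layers are involved: the attempt counter runs from 0 to 9 and then starts again at 0 after a longer pause. The sketch below only measures those pauses from timestamps copied out of the log. The class and method names (RetryGap, gapMillis) are made up for illustration. The roughly 33-second pause would be consistent with an outer retry loop sleeping about 30 seconds between rounds, which is the documented default of yarn.resourcemanager.connect.retry-interval.ms, but that is an inference from the timing and not something the log states.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Spark or Hadoop.
class RetryGap {
    // Timestamp layout used at the start of each log line above.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Milliseconds elapsed between two log timestamps. */
    static long gapMillis(String earlier, String later) {
        return Duration.between(
                LocalDateTime.parse(earlier, FMT),
                LocalDateTime.parse(later, FMT)).toMillis();
    }

    public static void main(String[] args) {
        // Two consecutive attempts inside the first round ("tried 8" -> "tried 9").
        System.out.println(gapMillis("2018-09-18 14:20:43.633", "2018-09-18 14:20:45.632")); // 1999
        // Last attempt of the first round -> first attempt of the second round.
        System.out.println(gapMillis("2018-09-18 14:20:45.632", "2018-09-18 14:21:18.665")); // 33033
    }
}
```

If this reading is right, the inner loop (10 attempts, 1000 ms sleep) is bounded, and it is the outer loop that makes the client appear to retry without end.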

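What custom configuration should I apply, and how can I stop the retries after a limited number of attempts or a limited amount of time?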
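
A possible direction, offered as a sketch and not a confirmed fix: Hadoop documents yarn.resourcemanager.connect.max-wait.ms (default 900000, i.e. 15 minutes) and yarn.resourcemanager.connect.retry-interval.ms (default 30000) for the ResourceManager connection, and ipc.client.connect.max.retries (default 10) for the IPC client, which matches the maxRetries=10 shown in the log. The helper below only collects these keys with the spark.hadoop. prefix that Spark uses to forward properties into the Hadoop configuration. The class and method names are hypothetical, the values are arbitrary examples, and whether this Client usage picks the properties up in a given Spark version is an assumption that would need to be verified.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Spark or Hadoop.
class RmRetrySettings {

    /**
     * Builds "spark.hadoop."-prefixed properties intended to bound how long
     * a client keeps trying to reach the ResourceManager.
     */
    static Map<String, String> boundedRetryProps(long maxWaitMs,
                                                 long retryIntervalMs,
                                                 int ipcMaxRetries) {
        Map<String, String> props = new LinkedHashMap<>();
        // Outer loop: total time to keep trying, and the pause between rounds.
        props.put("spark.hadoop.yarn.resourcemanager.connect.max-wait.ms",
                Long.toString(maxWaitMs));
        props.put("spark.hadoop.yarn.resourcemanager.connect.retry-interval.ms",
                Long.toString(retryIntervalMs));
        // Inner loop: connection attempts made by the IPC client per round.
        props.put("spark.hadoop.ipc.client.connect.max.retries",
                Integer.toString(ipcMaxRetries));
        return props;
    }

    public static void main(String[] args) {
        // Example values only: give up after about 10 s, pausing 2 s between rounds.
        boundedRetryProps(10_000L, 2_000L, 3)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied with sparkConf.set(key, value) before the Client is constructed. If the prefixed properties are not honored, the unprefixed keys can be set in yarn-site.xml and core-site.xml on the client's classpath instead.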

0 answers:

No answers