I am submitting a Spark application to YARN from Java using the org.apache.spark.deploy.yarn.Client API.
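import java.util.ArrayList;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
// Assumption: hasText comes from Spring's org.springframework.util.StringUtils.
import org.springframework.util.StringUtils;

SparkConf sparkConf = new SparkConf();
List<String> submitArgs = new ArrayList<>();
// The application name is optional; pass it only when one was supplied.
if (StringUtils.hasText(appName)) {
    submitArgs.add("--name");
    submitArgs.add(appName);
    sparkConf.setAppName(appName);
}
// Application jar and entry point.
submitArgs.add("--jar");
submitArgs.add(appJarPath);
submitArgs.add("--class");
submitArgs.add(appMainClass);
System.setProperty("SPARK_YARN_MODE", "true");
// Submit to YARN in cluster mode on the chosen queue.
sparkConf.setMaster("yarn")
    .set("spark.submit.deployMode", "cluster")
    .set("spark.yarn.queue", queue);
ClientArguments clientArguments = new ClientArguments(submitArgs.toArray(new String[0]));
Client client = new Client(clientArguments, sparkConf);
// Blocks the calling thread while the client talks to the ResourceManager.
client.run();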
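I am trying to test a scenario in which the client cannot connect to the ResourceManager. I expected the client to time out and throw an exception in that case. Instead, it keeps retrying the connection to the YARN ResourceManager and never seems to time out. See the console output below: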
2018-09-18 14:20:25.046 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.yarn.client.RMProxy : Connecting to ResourceManager at /0.0.0.0:8032
2018-09-18 14:20:27.600 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:29.603 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:31.600 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:33.607 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:35.608 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:37.619 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:39.620 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:41.623 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:43.633 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:20:45.632 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:18.665 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:20.676 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:22.679 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:24.686 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:26.691 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:28.695 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:30.697 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-18 14:21:32.708 INFO 10480 --- [nio-8080-exec-1] org.apache.hadoop.ipc.Client : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...
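One way to read the log above: the "Already tried" counter runs from 0 to 9 and then restarts at 0 after a pause of about 33 seconds (14:20:45.632 to 14:21:18.665), which suggests two nested retry loops rather than a single unbounded one. The sketch below is an illustration under assumptions, not a confirmed diagnosis. It assumes the inner loop is governed by the Hadoop IPC properties `ipc.client.connect.max.retries` and `ipc.client.connect.retry.interval`, the outer loop by `yarn.resourcemanager.connect.max-wait.ms` and `yarn.resourcemanager.connect.retry-interval.ms`, and that Spark forwards keys prefixed with `spark.hadoop.` into the Hadoop configuration used by the client. The class and method names (`RetryBudget`, `outerRetries`, `innerSleepMs`, `sparkOverrides`) are hypothetical helpers invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RetryBudget {
    // Hadoop property names assumed to control the two retry loops.
    static final String IPC_MAX_RETRIES = "ipc.client.connect.max.retries";
    static final String IPC_RETRY_INTERVAL_MS = "ipc.client.connect.retry.interval";
    static final String RM_MAX_WAIT_MS = "yarn.resourcemanager.connect.max-wait.ms";
    static final String RM_RETRY_INTERVAL_MS = "yarn.resourcemanager.connect.retry-interval.ms";

    /** Outer rounds allowed: total wait budget divided by the pause between rounds. */
    static long outerRetries(long maxWaitMs, long retryIntervalMs) {
        return maxWaitMs / retryIntervalMs;
    }

    /** Time spent sleeping inside one inner round (connect timeouts are not counted). */
    static long innerSleepMs(int maxRetries, long sleepMs) {
        return maxRetries * sleepMs;
    }

    /** Builds "spark.hadoop."-prefixed overrides that could be applied with sparkConf.set(key, value). */
    static Map<String, String> sparkOverrides(int ipcRetries, long rmMaxWaitMs, long rmIntervalMs) {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("spark.hadoop." + IPC_MAX_RETRIES, String.valueOf(ipcRetries));
        overrides.put("spark.hadoop." + RM_MAX_WAIT_MS, String.valueOf(rmMaxWaitMs));
        overrides.put("spark.hadoop." + RM_RETRY_INTERVAL_MS, String.valueOf(rmIntervalMs));
        return overrides;
    }

    public static void main(String[] args) {
        // Inner loop as printed in the log: maxRetries=10, sleepTime=1000 ms.
        System.out.println(innerSleepMs(10, 1000L));      // 10000
        // Assumed YARN defaults: 15 minutes total wait, 30 seconds between rounds.
        System.out.println(outerRetries(900_000L, 30_000L)); // 30
        // Example of tighter, purely illustrative values.
        sparkOverrides(3, 10_000L, 2_000L)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If these assumptions hold, the client would eventually give up on its own, but only after the outer wait budget is exhausted, which would look like an endless loop in a short test.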
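Which configuration settings should I customize, and how can I make the client stop retrying after a limited number of attempts or a limited amount of time?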
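Independently of any configuration, a hard time limit can be imposed from the caller's side by running the blocking call on a worker thread and waiting with a timeout. The following is a minimal sketch using only `java.util.concurrent`; `BoundedSubmit` and `runWithTimeout` are hypothetical names, and the `Runnable` in `main` is a stand-in for `client.run()`, not the real call. Cancellation relies on thread interruption, so it only stops the work if the blocked code honors interrupts, which is an assumption here for the Hadoop client.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedSubmit {
    /**
     * Runs a blocking task on a daemon worker thread and waits at most timeoutMs.
     * Returns true if the task finished in time, false if it timed out or the wait was interrupted.
     */
    static boolean runWithTimeout(Runnable task, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread worker = new Thread(r, "yarn-submit");
            worker.setDaemon(true); // do not keep the JVM alive if the task ignores interruption
            return worker;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause to the caller.
            throw new IllegalStateException("Submission failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for client.run(): blocks far longer than the allowed time.
        Runnable neverConnects = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(runWithTimeout(neverConnects, 200)); // false
        System.out.println(runWithTimeout(() -> { }, 200));     // true
    }
}
```

In the code from the question, the call would look like `runWithTimeout(client::run, 30_000)`, with the caller throwing its own exception when the result is `false`. The 30-second figure is only an example value.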