I have a Hadoop cluster with YARN enabled, and I have built a standalone Java Spark application that is submitted in client mode. The cluster starts and everything begins to initialize, but as soon as the YARN ApplicationMaster comes up on one of the Hadoop NodeManagers, my application hits the exception below.
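For context, the driver side is initialized along these lines (a minimal sketch; the class name, app name, and the trivial action are placeholders, not my actual code):

    import org.apache.spark.sql.SparkSession;

    public class MyApp {
        public static void main(String[] args) {
            // In client mode the driver runs in this JVM, and the YARN
            // ApplicationMaster connects back to it over Spark's RPC port.
            SparkSession spark = SparkSession.builder()
                    .appName("my-app")   // placeholder name
                    .master("yarn")      // client deploy mode is the default for "yarn"
                    .getOrCreate();

            spark.range(10).count();     // trivial action just to force initialization
            spark.stop();
        }
    }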
It looks like the ApplicationMaster is trying to connect back to the driver but cannot:

19/02/05 18:23:45 INFO yarn.ApplicationMaster: Driver now available: mydriver.host.com:32943
19/02/05 18:23:45 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Invalid Spark URL: spark://YarnScheduler@mydriver.host.com:32943
        at org.apache.spark.rpc.RpcEndpointAddress$.apply(RpcEndpointAddress.scala:66)
        at org.apache.spark.rpc.netty.NettyRpcEnv.asyncSetupEndpointRefByURI(NettyRpcEnv.scala:134)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
        at org.apache.spark.deploy.yarn.ApplicationMaster.createSchedulerRef(ApplicationMaster.scala:484)
        at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:677)
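Judging by the frames above, RpcEndpointAddress appears to parse the URL with java.net.URI, so I ran the failing URL through the same parser as a sanity check (a diagnostic sketch; mydriver.host.com is a redacted stand-in for my real container hostname):

    import java.net.URI;

    public class SparkUrlCheck {
        public static void main(String[] args) throws Exception {
            // The exact URL the ApplicationMaster rejected, from the log above.
            URI uri = new URI("spark://YarnScheduler@mydriver.host.com:32943");
            // java.net.URI silently yields null / -1 for components it cannot
            // parse; a null host or a negative port is the kind of thing that
            // would trip an "Invalid Spark URL" check.
            System.out.println("scheme = " + uri.getScheme());   // spark
            System.out.println("name   = " + uri.getUserInfo()); // YarnScheduler
            System.out.println("host   = " + uri.getHost());     // null if the hostname is not RFC-compliant
            System.out.println("port   = " + uri.getPort());     // 32943
        }
    }

One gotcha I know of is that java.net.URI returns a null host when the hostname contains underscores, which Docker-generated hostnames sometimes do; the redacted name above parses cleanly, so I can't tell from this alone whether that is what's happening here.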
mydriver.host.com is a Docker container in my cluster and is reachable from the other containers. I haven't found anything online that explains why Spark would throw this particular exception here.
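To rule out basic networking, I checked reachability from another container with something along these lines (a quick diagnostic sketch; the port comes from this particular run and changes on every submit):

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class ReachabilityCheck {
        public static void main(String[] args) throws Exception {
            // Driver RPC endpoint taken from the ApplicationMaster log; the
            // port is assigned per run, so it has to match the current attempt.
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress("mydriver.host.com", 32943), 2000);
                System.out.println("TCP connect to the driver succeeded");
            }
        }
    }

The connect succeeds, so plain TCP connectivity from the other containers to the driver does not seem to be the problem.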