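We have an Ignite cluster with 3 nodes, and all of our services connect to it using the Java thin client.
When one of the server nodes goes down and the services try to connect, some connection attempts succeed and others fail with a "cluster unavailable" error. We debugged the source code and found that, during construction of the ReliableChannel object, it picks one random node to connect to; if that node happens to be unavailable, a client connection exception is thrown. That matches the intermittent failures we see, since only the attempts that pick the dead node fail.
Ideally, we would expect it to fall back to another node, because other nodes in the cluster are still available. We can see that this fallback logic is implemented in the service method of the ReliableChannel class.
Is there any specific reason why the fallback is not implemented during object construction and is only used in the service method? Is there any option to make the client connect to the other nodes at that point?
Also, is it possible to control the order in which the nodes are tried?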
ReliableChannel code snippet:
    ReliableChannel(
        Function<ClientChannelConfiguration, Result<ClientChannel>> chFactory,
        ClientConfiguration clientCfg
    ) throws ClientException {
        if (chFactory == null)
            throw new NullPointerException("chFactory");

        if (clientCfg == null)
            throw new NullPointerException("clientCfg");

        this.chFactory = chFactory;
        this.clientCfg = clientCfg;

        List<InetSocketAddress> addrs = parseAddresses(clientCfg.getAddresses());

        primary = addrs.get(new Random().nextInt(addrs.size())); // we already verified there is at least one address

        ch = chFactory.apply(new ClientChannelConfiguration(clientCfg).setAddress(primary)).get();

        for (InetSocketAddress a : addrs)
            if (a != primary)
                this.backups.add(a);
    }
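To check that a single random pick like the one in this constructor would produce the mix of successes and failures we observe, here is a small standalone simulation. It is not Ignite code: the class name `RandomPrimarySimulation`, the node addresses, the seed, and the trial count are all made up for illustration, and it only models the random choice of `primary`, not real network connections.

```java
import java.util.List;
import java.util.Random;
import java.util.Set;

public class RandomPrimarySimulation {
    /**
     * Simulates {@code trials} constructor calls. Each call picks one address
     * uniformly at random (like the constructor's choice of primary) and counts
     * as a failure if that address is in the {@code down} set.
     */
    public static int countFailures(List<String> addrs, Set<String> down, int trials, long seed) {
        Random rnd = new Random(seed);
        int failures = 0;

        for (int i = 0; i < trials; i++) {
            String primary = addrs.get(rnd.nextInt(addrs.size()));

            if (down.contains(primary))
                failures++; // no fallback: a dead primary fails the whole attempt
        }

        return failures;
    }

    public static void main(String[] args) {
        // Hypothetical addresses; node2 plays the role of the failed server.
        List<String> addrs = List.of("node1:10800", "node2:10800", "node3:10800");
        int trials = 30_000;

        int failures = countFailures(addrs, Set.of("node2:10800"), trials, 42L);

        System.out.printf("failed %d of %d attempts (%.1f%%)%n",
            failures, trials, 100.0 * failures / trials);
    }
}
```

With one of three nodes down, the failure share should come out close to one third, which is in line with only some of our services failing to connect.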
    public <T> T service(
        ClientOperation op,
        Consumer<BinaryOutputStream> payloadWriter,
        Function<BinaryInputStream, T> payloadReader
    ) throws ClientException {
        ClientConnectionException failure = null;

        T res = null;

        int totalSrvs = 1 + backups.size();

        svcLock.lock();
        try {
            for (int i = 0; i < totalSrvs; i++) {
                try {
                    if (failure != null)
                        changeServer();

                    if (ch == null)
                        ch = chFactory.apply(new ClientChannelConfiguration(clientCfg).setAddress(primary)).get();

                    long id = ch.send(op, payloadWriter);

                    res = ch.receive(op, id, payloadReader);

                    failure = null;

                    break;
                }
                catch (ClientConnectionException e) {
                    if (failure == null)
                        failure = e;
                    else
                        failure.addSuppressed(e);
                }
            }
        }
        finally {
            svcLock.unlock();
        }

        if (failure != null)
            throw failure;

        return res;
    }
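Until we know whether the constructor can be made to fall back, a caller-side workaround we are considering is to try the addresses ourselves, in an order we choose, and keep the first client that connects. The sketch below only shows that loop. `OrderedConnect`, `connectInOrder`, and the `connector` function are hypothetical names of ours, not part of Ignite, and the lambda in `main` fakes a dead node instead of opening a real connection. The exception handling copies the pattern from `service` above: keep the first failure and attach later ones as suppressed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class OrderedConnect {
    /**
     * Tries each address in the given order and returns the first client that
     * the connector manages to create. If every attempt fails, throws the first
     * failure with all later failures attached as suppressed exceptions.
     */
    public static <C> C connectInOrder(List<String> addrs, Function<String, C> connector) {
        if (addrs.isEmpty())
            throw new IllegalArgumentException("addrs must not be empty");

        RuntimeException failure = null;

        for (String addr : addrs) {
            try {
                return connector.apply(addr);
            }
            catch (RuntimeException e) {
                if (failure == null)
                    failure = e;
                else
                    failure.addSuppressed(e);
            }
        }

        throw failure; // non-null here: addrs is non-empty and every attempt threw
    }

    public static void main(String[] args) {
        List<String> tried = new ArrayList<>();

        // Stand-in connector: pretends node1 is down and the others are reachable.
        String client = connectInOrder(
            List.of("node1:10800", "node2:10800", "node3:10800"),
            addr -> {
                tried.add(addr);

                if (addr.startsWith("node1"))
                    throw new IllegalStateException(addr + " is down");

                return "connected to " + addr;
            });

        System.out.println(client + ", tried " + tried);
        // prints: connected to node2:10800, tried [node1:10800, node2:10800]
    }
}
```

In our services the connector would presumably wrap the thin client start call for a single address (for example `Ignition.startClient` with a `ClientConfiguration`), converting any checked exception as needed; we have not verified this against a specific Ignite version. One drawback is that a client created with a single address has no backups left for `service` to fail over to, so this only addresses the initial connection.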