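While trying to diagnose a different problem with my cluster, I tried isolating my environment to force an election event. However, when the nodes are started individually, my application fails to start with this exception: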
Caused by: java.util.concurrent.TimeoutException: null
at org.neo4j.cluster.statemachine.StateMachineProxyFactory$ResponseFuture.get(StateMachineProxyFactory.java:300) ~[neo4j-cluster-2.0.1.jar:2.0.1]
at org.neo4j.cluster.client.ClusterJoin.joinByConfig(ClusterJoin.java:158) ~[neo4j-cluster-2.0.1.jar:2.0.1]
at org.neo4j.cluster.client.ClusterJoin.start(ClusterJoin.java:91) ~[neo4j-cluster-2.0.1.jar:2.0.1]
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:503) ~[neo4j-kernel-2.0.1.jar:2.0.1]
... 59 common frames omitted
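My configuration sets a 60-second join timeout (ha.cluster_join_timeout) and allows individual nodes to initialize the cluster (ha.allow_init_cluster).
Looking at this truncated code block from the ClusterJoin class, I believe that in some failure cases the code either loops and tries to join again, or the current node creates a new cluster.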
private void joinByConfig() throws TimeoutException
{
    while( true )
    {
        if (config.getClusterJoinTimeout() > 0)
        {
            try
            {
                console.log( "Joined cluster:" + clusterConfig.get(config.getClusterJoinTimeout(), TimeUnit.MILLISECONDS ));
                return;
            }
            catch ( InterruptedException e )
            {
                console.log( "Could not join cluster, interrupted. Retrying..." );
            }
            catch ( ExecutionException e )
            {
                logger.debug( "Could not join cluster " + this.config.getClusterName() );
                if ( e.getCause() instanceof IllegalStateException )
                {
                    throw ((IllegalStateException) e.getCause());
                }
                if ( config.isAllowedToCreateCluster() )
                {
                    // Failed to join cluster, create new one
                    console.log( "Could not join cluster of " + hosts.toString() );
                    console.log( format( "Creating new cluster with name [%s]...", config.getClusterName() ) );
                    cluster.create( config.getClusterName() );
                    break;
                }
                console.log( "Could not join cluster, timed out. Retrying..." );
            }
        }
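But TimeoutException is not one of those cases; in fact, the joinByConfig method itself is declared to throw TimeoutException. The StateMachineProxyFactory$ResponseFuture class (which implements Future) throws a TimeoutException when the wait time elapses and no state machine message has been received.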
public synchronized Object get( long timeout, TimeUnit unit )
        throws InterruptedException, ExecutionException, TimeoutException
{
    if ( response != null )
    {
        getResult();
    }
    this.wait( unit.toMillis( timeout ) );
    if ( response == null )
    {
        throw new TimeoutException();
    }
    return getResult();
}
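To check my reading of that method, here is a minimal standalone sketch of the same wait-then-check pattern. SimpleResponseFuture and tryJoin are hypothetical names of my own, not Neo4j classes, and the sketch leaves out the state machine entirely; it only shows that a future nobody answers ends in a TimeoutException once the wait elapses.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ResponseFutureSketch {

    /** Hypothetical stand-in for the response future; not the Neo4j class. */
    static class SimpleResponseFuture {
        private Object response;

        /** Called by whoever delivers the answer; wakes up any waiting caller. */
        synchronized void setResponse(Object value) {
            response = value;
            notifyAll();
        }

        /** Waits up to the timeout, then throws if no response has arrived. */
        synchronized Object get(long timeout, TimeUnit unit)
                throws InterruptedException, TimeoutException {
            if (response != null) {
                return response;
            }
            wait(unit.toMillis(timeout));
            if (response == null) {
                throw new TimeoutException();
            }
            return response;
        }
    }

    /** Reports how a join attempt against the given future ends. */
    static String tryJoin(SimpleResponseFuture future, long timeoutMillis) {
        try {
            return "joined: " + future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // No other node ever answers: the wait elapses and get() throws.
        System.out.println(tryJoin(new SimpleResponseFuture(), 50));

        // A response is already present: get() returns it immediately.
        SimpleResponseFuture answered = new SimpleResponseFuture();
        answered.setResponse("cluster-config");
        System.out.println(tryJoin(answered, 50));
    }
}
```

With a node started on its own, I would expect the first case: nothing responds within the join timeout, so the exception is thrown.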
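Is a timeout expected when joining the cluster? And if the node is configured to initialize the cluster, shouldn't the TimeoutException be handled rather than propagated, so that a new cluster is initialized? If not, do the cluster servers have to be started in unison?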
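To make the question concrete, this is roughly the fallback I expected, written as a self-contained sketch. It is not what Neo4j 2.0.1 does and not a proposed patch: JoinAttempt, joinOrCreate, and the cluster name "example-cluster" are all hypothetical placeholders, and the real join and create calls are reduced to strings.

```java
import java.util.concurrent.TimeoutException;

public class JoinOrCreateSketch {

    /** Hypothetical stand-in for one attempt to join an existing cluster. */
    interface JoinAttempt {
        String join() throws TimeoutException;
    }

    /**
     * Tries to join; on timeout, creates a new cluster if that is allowed,
     * otherwise lets the TimeoutException propagate to the caller.
     */
    static String joinOrCreate(JoinAttempt attempt,
                               boolean allowedToCreateCluster,
                               String clusterName) throws TimeoutException {
        try {
            return "joined " + attempt.join();
        } catch (TimeoutException e) {
            if (allowedToCreateCluster) {
                // Assumption: a timeout is treated like any other failed join.
                return "created " + clusterName;
            }
            throw e;
        }
    }

    public static void main(String[] args) throws TimeoutException {
        // Simulates a lone node: nobody answers the join request in time.
        JoinAttempt nobodyAnswers = () -> {
            throw new TimeoutException();
        };
        System.out.println(joinOrCreate(nobodyAnswers, true, "example-cluster"));
    }
}
```

In the code I quoted, only the ExecutionException branch checks isAllowedToCreateCluster(), which is why the timeout reaches my application instead.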