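I am having a problem with HazelcastClient (Java) when the cluster goes down. Both the client and the cluster run Hazelcast 3.8.1, the latest version.
I periodically execute the following code: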
getMap().executeOnEntries(new MyProcessor<>(), Predicates.equal("field", var));
The problem is that when the cluster is down, Hazelcast only logs the errors as warnings and does not throw an exception:
2017-04-28 18:32:19,905 [WARN] from com.hazelcast.client.connection.ClientConnectionManager in hz.client_0.internal-1 - hz.client_0 [aa-api] [3.8.1] Heartbeat failed to connection : ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=never, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1}
2017-04-28 18:32:20,884 [WARN] from com.hazelcast.client.spi.ClientPartitionService in hz.client_0.internal-3 - hz.client_0 [aa-api] [3.8.1] Error while fetching cluster partition table!
java.util.concurrent.ExecutionException: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=never, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1}
at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolve(ClientInvocationFuture.java:73)
at com.hazelcast.spi.impl.AbstractInvocationFuture$1.run(AbstractInvocationFuture.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at com.hazelcast.util.executor.LoggingScheduledExecutor$LoggingDelegatingFuture.run(LoggingScheduledExecutor.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92)
Caused by: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=never, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1}
at com.hazelcast.client.spi.impl.ClientInvocationServiceSupport$CleanResourcesTask.notifyException(ClientInvocationServiceSupport.java:229)
at com.hazelcast.client.spi.impl.ClientInvocationServiceSupport$CleanResourcesTask.run(ClientInvocationServiceSupport.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
... 6 common frames omitted
Caused by: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=never, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1}
at com.hazelcast.client.spi.impl.ClusterListenerSupport.heartbeatStopped(ClusterListenerSupport.java:259)
at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl$Heartbeat.fireHeartbeatStopped(ClientConnectionManagerImpl.java:503)
at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl$Heartbeat.run(ClientConnectionManagerImpl.java:462)
... 10 common frames omitted
2017-04-28 18:32:22,904 [WARN] from com.hazelcast.client.connection.nio.ClientConnection in hz.client_0.internal-1 - hz.client_0 [aa-api] [3.8.1] ClientConnection{alive=false, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=2017-04-28 18:32:19.905, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1} lost. Reason: com.hazelcast.spi.exception.TargetDisconnectedException[Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/xxx.xxx.4.125:49688 remote=/xxx.xxx.8.118:5701]}, remoteEndpoint=[xxx.xxx.8.118]:5701, lastReadTime=2017-04-28 18:31:15.445, lastWriteTime=2017-04-28 18:32:14.905, closedTime=never, lastHeartbeatRequested=2017-04-28 18:32:14.905, lastHeartbeatReceived=2017-04-28 18:31:14.905, connected server version=3.8.1}]
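How can I handle this exception so that I can take action?
Thanks,
Edit: The problem also occurs when the node the client is connected to disconnects. The client does not connect to another node (AWS Discovery).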
Answer 0 (score: 2)
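The problem was mainly in the configuration. Some timeouts and health check intervals were too high.
Below are the client's default properties: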
hazelcast.client.heartbeat.interval = 10000ms
hazelcast.client.heartbeat.timeout = 300000ms
hazelcast.client.invocation.timeout.seconds = 120s
Here are my new values:
hazelcast.client.heartbeat.interval = 2000
hazelcast.client.heartbeat.timeout = 5000
hazelcast.client.invocation.timeout.seconds = 10
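As a minimal sketch of how these overrides could be applied from code, the helper below collects them in a java.util.Properties object and copies them into JVM system properties, which Hazelcast also consults for its properties. The class name ClientTimeouts and its methods are hypothetical; only the property keys and values come from the list above.

```java
import java.util.Properties;

// Hypothetical helper: collects the client timeout overrides listed above.
final class ClientTimeouts {
    private ClientTimeouts() {}

    static Properties overrides() {
        Properties props = new Properties();
        props.setProperty("hazelcast.client.heartbeat.interval", "2000");       // milliseconds
        props.setProperty("hazelcast.client.heartbeat.timeout", "5000");        // milliseconds
        props.setProperty("hazelcast.client.invocation.timeout.seconds", "10"); // seconds
        return props;
    }

    // Copies the overrides into JVM system properties.
    // Call this before the client is created so the values are picked up.
    static void applyAsSystemProperties() {
        Properties props = overrides();
        for (String name : props.stringPropertyNames()) {
            System.setProperty(name, props.getProperty(name));
        }
    }
}
```

With a ClientConfig in hand, each entry can instead be set with config.setProperty(name, value) before calling HazelcastClient.newHazelcastClient(config).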
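In addition, I completely changed the way I obtain maps, topics and, more generally, the Hazelcast instance.
On instantiation
I handle every exception (mostly ones extending RuntimeException), and I use this step to notify each class that the instance is now available.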
try {
    hazelcastInstance = HazelcastClient.newHazelcastClient(config);
    eventListeners.forEach(HazelcastEventListener::onConnect);
} catch (Throwable e) {
    Logger.error(e.getMessage(), e);
    return null;
}
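Before each request that uses the instance
I call code that verifies that the instance is available; if an error occurs, I use it to notify each class that the instance is down.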
public boolean isClientActive() {
    if (getInstance() == null) {
        return false;
    }
    try {
        // cheap remote call used as a health probe
        getMap("registration").isLocked("a");
    } catch (Throwable e) {
        hazelcastInstance = null;
        eventListeners.forEach(HazelcastEventListener::onDisconnect);
        return false;
    }
    return true;
}
Getting notified when a member leaves
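// add a membership listener on the cluster
// to get notified when a member is removed
hazelcastInstance.getCluster().addMembershipListener(new MembershipListener() {
    @Override
    public void memberAdded(MembershipEvent membershipEvent) {}

    @Override
    public void memberRemoved(MembershipEvent membershipEvent) {
        // no members left: the whole cluster is gone, so restart the client
        if (membershipEvent.getMembers().isEmpty()) {
            restartInstance();
        }
    }

    // also required by the MembershipListener interface in 3.x
    @Override
    public void memberAttributeChanged(MemberAttributeEvent memberAttributeEvent) {}
});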
Handling my HazelcastEventListener
Every class that uses Hazelcast registers an event listener:
hazelcastManager.addEventListener(new HazelcastEventListener() {
    @Override
    public void onConnect() {
        map = hazelcastManager.getMap(mapName);
    }

    @Override
    public void onDisconnect() {
        map = null;
    }
});
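The HazelcastEventListener interface and the manager's listener bookkeeping are not shown in this answer. The following is a rough sketch of what they might look like, using only the JDK. The interface shape is inferred from the two callbacks used above; ListenerRegistry, notifyConnected and notifyDisconnected are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Assumed shape of the listener; inferred from the onConnect/onDisconnect callbacks.
interface HazelcastEventListener {
    void onConnect();
    void onDisconnect();
}

// Hypothetical registry: the part of the manager that tracks listeners.
final class ListenerRegistry {
    // CopyOnWriteArrayList lets classes register while a notification is in progress.
    private final List<HazelcastEventListener> eventListeners = new CopyOnWriteArrayList<>();

    void addEventListener(HazelcastEventListener listener) {
        eventListeners.add(listener);
    }

    // Called after the client has been created successfully.
    void notifyConnected() {
        eventListeners.forEach(HazelcastEventListener::onConnect);
    }

    // Called when the health check fails or the cluster is lost.
    void notifyDisconnected() {
        eventListeners.forEach(HazelcastEventListener::onDisconnect);
    }
}
```

A thread-safe list is one way to address part of the concurrency concern, since listeners may be added from different threads than the one that detects a disconnect.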
Reconnecting the Hazelcast client
When hazelcastInstance is null, calling getInstance() tries to reconnect.
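The reconnect-on-demand idea can be sketched without any Hazelcast types. In the sketch below, ReconnectingHolder is a hypothetical class, T stands in for HazelcastInstance, and the factory stands in for a call such as HazelcastClient.newHazelcastClient(config). It is an illustration under those assumptions, not the code actually used here.

```java
import java.util.function.Supplier;

// Hypothetical holder: T stands in for HazelcastInstance, the factory for the
// code that creates a new client.
final class ReconnectingHolder<T> {
    private final Supplier<T> factory;
    private T instance; // guarded by "this"

    ReconnectingHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    // Returns the held instance, making one reconnect attempt if none is held.
    // Returns null when the attempt fails, so callers can treat that as "down".
    synchronized T getInstance() {
        if (instance == null) {
            try {
                instance = factory.get();
            } catch (RuntimeException e) {
                instance = null; // still disconnected; the next call retries
            }
        }
        return instance;
    }

    // Drops the instance after a failed health check so the next call reconnects.
    synchronized void invalidate() {
        instance = null;
    }
}
```

Synchronizing both methods keeps two threads from creating two clients at once, at the cost of blocking other callers while a connection attempt is in progress.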
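Issues
This avoids many errors, but there is still work to do to handle concurrency issues. In fact, I consider this solution a workaround: it is not very efficient, and it is mostly a patch for a feature that is missing in Hazelcast.
That is why I will not "accept" this solution. If anyone has a better one, please let us know.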