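Using Akka.NET, I am trying to implement a simple cluster use case.
There are two participants: a master node that listens for cluster events, and a slave node that joins the cluster.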
Address address = new Address("akka.tcp", "ClusterSystem", "master", 8080);
cluster.Join(address);
On receiving a ClusterEvent.MemberUp message, the master node creates a link to the actor:
ClusterEvent.MemberUp up = message as ClusterEvent.MemberUp;
ActorSelection nodeActor = system.ActorSelection(up.Member.Address + "/user/slave_0");
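Sending a message to this actor results in an error:
Association with remote system akka.tcp://ClusterSystem@slave:8090 has failed; address is now gated for 5000 ms. Reason is: [Disassociated]
Master config: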
akka {
actor {
provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
}
remote {
helios.tcp {
port = 8080
hostname = master
bind-hostname = master
bind-port = 8080
send-buffer-size = 512000b
receive-buffer-size = 512000b
maximum-frame-size = 1024000b
tcp-keepalive = on
}
}
cluster{
failure-detector {
heartbeat-interval = 10s
}
auto-down-unreachable-after = 10s
gossip-interval = 5s
}
stdout-loglevel = DEBUG
loglevel = DEBUG
debug {
receive = on
autoreceive = on
lifecycle = on
event-stream = on
unhandled = on
}
}
Slave config:
akka {
actor {
provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
}
remote {
helios.tcp {
port = 8090
hostname = slave
bind-hostname = slave
bind-port = 8090
send-buffer-size = 512000b
receive-buffer-size = 512000b
maximum-frame-size = 1024000b
tcp-keepalive = on
}
}
cluster{
failure-detector {
heartbeat-interval = 10s
}
auto-down-unreachable-after = 10s
gossip-interval = 5s
}
stdout-loglevel = DEBUG
loglevel = DEBUG
debug {
receive = on
autoreceive = on
lifecycle = on
event-stream = on
unhandled = on
}
}
Answer 0 (score: 2)
Here is your problem:
cluster{
failure-detector {
heartbeat-interval = 10s
}
auto-down-unreachable-after = 10s
gossip-interval = 5s
}
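Your heartbeat-interval and your auto-down-unreachable-after duration are the same, so your nodes are almost guaranteed to disassociate themselves after 10 seconds: you are betting on a race condition that the failure detector might lose.
auto-down-unreachable-after is a dangerous setting. Do not use it. You will end up with a split brain or worse.
Make sure your failure detector interval is always lower than your auto-down interval.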
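As a rough sketch of that advice, the cluster section could look something like the fragment below. The specific values are illustrative assumptions rather than numbers from the question, and acceptable-heartbeat-pause is a setting the original config did not use, so check everything against the reference configuration of your Akka.NET version. The same section would go into both the master and the slave config.

```hocon
akka {
  cluster {
    failure-detector {
      # Assumed value: heartbeats sent far more often than any downing timeout.
      heartbeat-interval = 1s
      # Assumed value: how long missed heartbeats are tolerated before a node
      # is considered unreachable (not present in the original config).
      acceptable-heartbeat-pause = 3s
    }
    # Auto-downing is switched off here, following the advice above.
    # If you keep it anyway, it should be well above heartbeat-interval.
    auto-down-unreachable-after = off
    # Assumed value, chosen only for illustration.
    gossip-interval = 1s
  }
}
```

With auto-downing off, an unreachable node stays in the cluster as unreachable until it recovers or is downed by some other means, which avoids the race described above.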