Akka(.NET) cluster with remote nodes: Disassociated exception

Asked: 2015-09-02 11:19:46

Tags: akka akka-cluster akka.net

Using Akka(.NET), I am trying to implement a simple cluster use case.

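  1. Cluster: used for node up/down events.
  2. Remote: used for sending messages to a specific node.
  3. There are two participants: a master node that listens to cluster events, and a slave node that joins the cluster.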

    Address address = new Address("akka.tcp", "ClusterSystem", "master", 8080);
    cluster.Join(address);
    

On receiving a ClusterEvent.MemberUp message, the master node creates an actor link:

    ClusterEvent.MemberUp up = message as ClusterEvent.MemberUp;
    ActorSelection nodeActor = system.ActorSelection(up.Member.Address + "/user/slave_0");
    

Sending a message to this actor results in an error:

    Association with remote system akka.tcp://ClusterSystem@slave:8090 has failed; address is now gated for 5000 ms. Reason is: [Disassociated]

Master config:

        akka {
            actor {
                provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
            }
    
            remote {
                helios.tcp {
                    port = 8080
                    hostname = master
                    bind-hostname = master
                    bind-port = 8080
                    send-buffer-size = 512000b
                    receive-buffer-size = 512000b
                    maximum-frame-size = 1024000b
                    tcp-keepalive = on
                }
            }
            cluster{
                failure-detector {
                    heartbeat-interval = 10 s
                }
                auto-down-unreachable-after = 10s
                gossip-interval = 5s
            }
            stdout-loglevel = DEBUG
            loglevel = DEBUG
    
            debug {{  
                receive = on 
                autoreceive = on
                lifecycle = on
                event-stream = on
                unhandled = on
            }}
        }
    

Slave config:

    akka {
            actor {
                provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
            }
    
        remote {
            helios.tcp {
                port = 8090
                hostname = slave
                bind-hostname = slave
                bind-port = 8090
                send-buffer-size = 512000b
                receive-buffer-size = 512000b
                maximum-frame-size = 1024000b
                tcp-keepalive = on
            }
        }
        cluster{
            failure-detector {
                heartbeat-interval = 10 s
            }
            auto-down-unreachable-after = 10s
            gossip-interval = 5s
        }
        stdout-loglevel = DEBUG
        loglevel = DEBUG
    
        debug {{  
            receive = on 
            autoreceive = on
            lifecycle = on
            event-stream = on
            unhandled = on
        }}
    
    }
    

1 answer:

Answer 0 (score: 2)

Here's your problem:

    cluster{
        failure-detector {
            heartbeat-interval = 10 s
        }
        auto-down-unreachable-after = 10s
        gossip-interval = 5s
    }

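heartbeat-interval and auto-down-unreachable-after are set to the same duration, so your nodes will almost always disassociate automatically after 10 seconds: you are betting on a race condition that the failure detector can lose.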

auto-down-unreachable-after is a dangerous setting; don't use it. You will end up with a split brain or worse.

Make sure your failure detector interval is always lower than your auto-down interval.
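
As a rough sketch of that advice, the cluster section of both the master and slave configs could look something like the fragment below. The 1 s heartbeat is an assumed, illustrative value rather than something taken from the question, and auto-down is switched off entirely instead of being tuned.

```hocon
akka {
    cluster {
        failure-detector {
            # Assumed value for illustration: heartbeats are sent much more
            # often than any downing timeout would fire.
            heartbeat-interval = 1 s
        }
        # "off" disables automatic downing of unreachable members.
        auto-down-unreachable-after = off
        # Kept as in the original question.
        gossip-interval = 5s
    }
}
```

If auto-down is kept anyway, its duration should be comfortably longer than heartbeat-interval so that the failure detector has time to see several heartbeats before a node is downed.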