IIS recycling the AppPool

Date: 2016-06-07 13:35:34

Tags: akka akka-cluster akka.net akka.net-cluster akka-remoting

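We have built an Akka Cluster infrastructure for SMS, email, and push notifications. There are three kinds of nodes in the system: client, sender, and lighthouse. The client role is used by our web application and our API application (both hosted in IIS). The lighthouse and sender roles are hosted as Windows services. Because IIS recycles the AppPools of the web and API applications, we shut down the client role's actor system and start it again in the Start and Stop events of global.asax.cs. From the logs we can see that the system shuts down and rejoins the cluster successfully.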
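A minimal sketch of the lifecycle handling described above, for illustration only and not the poster's actual code. It assumes the events in question are the standard Application_Start and Application_End handlers, that the HOCON in web.config is picked up by ActorSystem.Create, and that the Akka.NET version in use provides Cluster.RegisterOnMemberRemoved and ActorSystem.Terminate. Leaving the cluster before terminating and the 10-second timeouts are assumptions, not something stated in the question.

```csharp
using System;
using System.Threading;
using System.Web;
using Akka.Actor;
using Akka.Cluster;

public class Global : HttpApplication
{
    // One actor system per AppDomain; IIS creates a new AppDomain on every recycle.
    private static ActorSystem _system;

    protected void Application_Start(object sender, EventArgs e)
    {
        // Loads the <akka><hocon> section from web.config and starts joining the seed nodes.
        _system = ActorSystem.Create("NotificationSystem");
    }

    protected void Application_End(object sender, EventArgs e)
    {
        if (_system == null) return;

        var cluster = Cluster.Get(_system);
        var removed = new ManualResetEventSlim();

        // Assumption: ask the cluster to remove this node gracefully before shutdown,
        // so the other nodes do not have to detect it as unreachable.
        cluster.RegisterOnMemberRemoved(() => removed.Set());
        cluster.Leave(cluster.SelfAddress);
        removed.Wait(TimeSpan.FromSeconds(10)); // arbitrary timeout, tune as needed

        // Terminate() exists from Akka.NET 1.1; older versions used Shutdown().
        _system.Terminate().Wait(TimeSpan.FromSeconds(10));
        _system = null;
    }
}
```

IIS limits how long Application_End may run during a recycle, so both waits are bounded rather than blocking until the cluster confirms removal.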

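Sometimes, however, when the AppPool recycles, the client ActorSystem starts but cannot join the cluster, and our notifications stop working (which is a huge problem for us). When we manually shut down the ActorSystem and start it again, it joins the cluster. This happens roughly once every two days.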

We can observe that the client joined the cluster before the error:

  

  Node [akka.tcp://NotificationSystem@:41350] is JOINING, roles [client]
  Leader is moving node [akka.tcp://NotificationSystem@:41350] to [Up]

Looking at the logs, we see the following error after the client has joined the cluster:

  

  Shut down address: akka.tcp://NotificationSystem@:41350
  Akka.Remote.ShutDownAssociation: Shut down address: akka.tcp://NotificationSystem@:41350 ---> Akka.Remote.Transport.InvalidAssociationException: The remote system terminated the association because it is shutting down.
     --- End of inner exception stack trace ---
     at Akka.Remote.EndpointWriter.PublishAndThrow(Exception reason, LogLevel level)
     at Akka.Remote.EndpointWriter.b__20_0(Exception ex)
     at Akka.Actor.LocalOnlyDecider.Decide(Exception cause)
     at Akka.Actor.OneForOneStrategy.Handle(IActorRef child, Exception x)
     at Akka.Actor.SupervisorStrategy.HandleFailure(ActorCell actorCell, Exception cause, ChildRestartStats failedChildStats, IReadOnlyCollection`1 allChildren)
     at Akka.Actor.ActorCell.HandleFailed(Failed f)
     at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
  --- End of stack trace from previous location where exception was thrown ---
     at Akka.Actor.ActorCell.HandleFailed(Failed f)
     at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
  Akka.Remote.ShutDownAssociation: Shut down address: akka.tcp://NotificationSystem@:41350 ---> Akka.Remote.Transport.InvalidAssociationException: The remote system terminated the association because it is shutting down.
     --- End of inner exception stack trace ---
     at Akka.Remote.EndpointWriter.PublishAndThrow(Exception reason, LogLevel level)
     at Akka.Remote.EndpointWriter.b__20_0(Exception ex)
     at Akka.Actor.LocalOnlyDecider.Decide(Exception cause)
     at Akka.Actor.OneForOneStrategy.Handle(IActorRef child, Exception x)
     at Akka.Actor.SupervisorStrategy.HandleFailure(ActorCell actorCell, Exception cause, ChildRestartStats failedChildStats, IReadOnlyCollection`1 allChildren)
     at Akka.Actor.ActorCell.HandleFailed(Failed f)
     at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
  --- End of stack trace from previous location where exception was thrown ---
     at Akka.Actor.ActorCell.HandleFailed(Failed f)
     at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)

After that error, we see the following error message:

  

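  Association to [akka.tcp://NotificationSystem@:41350] having UID [226948907] is irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.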

The system cannot correct itself unless the client role is restarted.

Our client role configuration is:

<akka>
<hocon>
    <![CDATA[
        akka{
            loglevel = DEBUG

            actor{
                provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"

                deployment {
                    /coordinatorRouter {
                        router = round-robin-group
                        routees.paths = ["/user/NotificationCoordinator"]
                        cluster {
                                enabled = on
                                max-nr-of-instances-per-node = 1
                                allow-local-routees = off
                                use-role = sender
                        }
                    }                
                }

                serializers {
                    wire = "Akka.Serialization.WireSerializer, Akka.Serialization.Wire"
                }

                serialization-bindings {
                 "System.Object" = wire
                }

                debug{
                    receive = on
                    autoreceive = on
                    lifecycle = on
                    event-stream = on
                    unhandled = on
                }
            }

            remote {
                helios.tcp {
                        transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                        applied-adapters = []
                        transport-protocol = tcp
                        hostname = "***.***.**.**"
                        port = 0
                }
            }

            cluster {
                    seed-nodes = ["akka.tcp://NotificationSystem@***.***.**.**:5053", "akka.tcp://NotificationSystem@***.***.**.**:5073"]
                    roles = [client]
            }
        }
    ]]>
</hocon>

Our sender role configuration is:

  <akka>
<hocon><![CDATA[
            akka{
                loglevel = INFO

                loggers = ["Akka.Logger.NLog.NLogLogger, Akka.Logger.NLog"]

                actor{
                    debug {  
                        # receive = on 
                        # autoreceive = on
                        # lifecycle = on
                        # event-stream = on
                        # unhandled = on
                    }         

                    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"           

                    serializers {
                        wire = "Akka.Serialization.WireSerializer, Akka.Serialization.Wire"
                    }

                    serialization-bindings {
                     "System.Object" = wire
                    }

                    deployment{
                        /NotificationCoordinator/ApplePushNotificationActor{
                            router = round-robin-pool
                            resizer{
                                enabled = on
                                lower-bound = 3
                                upper-bound = 5
                            }
                        }

                        /NotificationCoordinator/AndroidPushNotificationActor{
                            router = round-robin-pool
                            resizer{
                                enabled = on
                                lower-bound = 3
                                upper-bound = 5
                            }
                        }

                        /NotificationCoordinator/EmailActor{
                            router = round-robin-pool
                            resizer{
                                enabled = on
                                lower-bound = 3
                                upper-bound = 5
                            }
                        }

                        /NotificationCoordinator/SmsActor{
                            router = round-robin-pool
                            resizer{
                                enabled = on
                                lower-bound = 3
                                upper-bound = 5
                            }
                        }

                        /NotificationCoordinator/LoggingCoordinator/ResponseLoggerActor{
                            router = round-robin-pool
                            resizer{
                                enabled = on
                                lower-bound = 3
                                upper-bound = 5
                            }
                        }                           
                    }
                }

             remote{                            
                        log-remote-lifecycle-events = DEBUG
                        log-received-messages = on

                        helios.tcp{
                            transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                            applied-adapters = []
                            transport-protocol = tcp
                            #will be populated with a dynamic host-name at runtime if left uncommented
                            #public-hostname = "POPULATE STATIC IP HERE"
                            hostname = "***.***.**.**"
                            port = 0
                    }
                }

                cluster {
                        seed-nodes = ["akka.tcp://NotificationSystem@***.***.**.**:5053", "akka.tcp://NotificationSystem@***.***.**.**:5073"]
                        roles = [sender]
                }
            }
        ]]></hocon>

How can we solve this problem? Thanks.

1 answer:

Answer 0 (score: 2)

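This is definitely a bug in the EndpointManager in Akka.Remote. Akka.NET 1.1, due to be released on June 14th, should resolve it. We have fixed a large number of cluster rejoin bugs along these lines, but the fixes have not been released yet. Akka.Cluster will go RTM as part of that release.

In the meantime, if you want to try the new bits, you can use the Akka.NET Nightly Builds.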