How do I recover an Akka actor that fails while processing a message it sent to itself?

Time: 2019-07-10 22:28:07

Tags: scala akka akka-actor

I have the following sample actor code.

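import akka.actor.{Actor, ActorLogging, ActorRef, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Restart, Resume}
import scala.concurrent.duration._

object SomeExternalDep {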
  private var flag = true

  // this function will throw an exception once when called with the value 3, then it won't throw another exception
  @throws[RuntimeException]
  def potentiallyThrows(curr: Int): Unit = {
    if (curr == 3 && flag) {
      flag = false
      throw new RuntimeException("Something went wrong in external dependency")
    }
  }
}

class CountingActor(start: Int, end: Int)
  extends Actor
    with ActorLogging {

  var curr: Int = start

  // This method counts for us
  private def doCount(): Unit = {
    // This may throw, which will cause this actor to fail
    SomeExternalDep.potentiallyThrows(curr)

    // Send self a Count message. If the call above throws, this is never reached
    if (curr <= end) {
      self ! CountingActor.Count(curr)
    }
  }

  override def receive: Receive = {
    case CountingActor.Start => doCount()
    case CountingActor.Count(n) =>
      log.info(s"Counting: $n")
      curr += 1
      doCount()

    case x => log.error(s"bad message $x")
  }

  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    log.error(s"CountingActor failed while processing $message")
    self ! CountingActor.Start
  }
}

object CountingActor {
  def props(start: Int, end: Int): Props = Props(new CountingActor(start, end))

  case object Start
  case class Count(n: Int)
}

class SupervisorActor
  extends Actor
    with ActorLogging {

  var countingActor: ActorRef = _

  override val supervisorStrategy: OneForOneStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
      // case _: RuntimeException => Restart
      case _: RuntimeException => Resume
    }

  private def doStart(): Unit = {
    countingActor = context.actorOf(CountingActor.props(0, 5))

    countingActor ! CountingActor.Start
  }

  override def receive: Receive = {
    case SupervisorActor.Init => doStart()
    case _ => log.error(s"Supervisor doesn't process messages")
  }

}

object SupervisorActor {
  // Not shown in the original post; assumed to be a plain case object
  case object Init
}

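Here, CountingActor essentially sends itself a Count message and then calls an external dependency that may fail. While counting, it also changes its internal state. I have also implemented a simple SupervisorActor, which creates the CountingActor as its child.

When the supervision strategy is set to Restart, I get the expected result. The actor counts up to 3 and fails because it hits the exception. The preRestart hook sends a new Start message to the mailbox, and the actor starts counting again.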

[INFO] [07/10/2019 15:23:59.895] [counting-sys-akka.actor.default-dispatcher-2] [akka://counting-sys/user/$a/$a] Counting: 0
[INFO] [07/10/2019 15:23:59.896] [counting-sys-akka.actor.default-dispatcher-2] [akka://counting-sys/user/$a/$a] Counting: 1
[INFO] [07/10/2019 15:23:59.896] [counting-sys-akka.actor.default-dispatcher-2] [akka://counting-sys/user/$a/$a] Counting: 2
[ERROR] [07/10/2019 15:23:59.905] [counting-sys-akka.actor.default-dispatcher-2] [akka://counting-sys/user/$a/$a] Something went wrong in external dependency
java.lang.RuntimeException: Something went wrong in external dependency
    at SomeExternalDep$.potentiallyThrows(ActorSupervisionTest.scala:15)
    at CountingActor.CountingActor$$doCount(ActorSupervisionTest.scala:30)

<Stack Trace omitted>

[ERROR] [07/10/2019 15:23:59.909] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] CountingActor failed while processing Some(Count(2))
[INFO] [07/10/2019 15:23:59.912] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 0
[INFO] [07/10/2019 15:23:59.912] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 1
[INFO] [07/10/2019 15:23:59.912] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 2
[INFO] [07/10/2019 15:23:59.912] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 3
[INFO] [07/10/2019 15:23:59.913] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 4
[INFO] [07/10/2019 15:23:59.913] [counting-sys-akka.actor.default-dispatcher-3] [akka://counting-sys/user/$a/$a] Counting: 5

But when I change the supervision strategy to Resume, the actor gets stuck: because it failed, it never sent the next Count message.

[INFO] [07/10/2019 15:26:01.779] [counting-sys-akka.actor.default-dispatcher-5] [akka://counting-sys/user/$a/$a] Counting: 0
[INFO] [07/10/2019 15:26:01.780] [counting-sys-akka.actor.default-dispatcher-5] [akka://counting-sys/user/$a/$a] Counting: 1
[INFO] [07/10/2019 15:26:01.780] [counting-sys-akka.actor.default-dispatcher-5] [akka://counting-sys/user/$a/$a] Counting: 2
[WARN] [07/10/2019 15:26:01.786] [counting-sys-akka.actor.default-dispatcher-4] [akka://counting-sys/user/$a/$a] Something went wrong in external dependency

How can I fix this so that counting continues from 3 when the external dependency fails?

1 answer:

Answer 0 (score: 2)

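It looks like the actor is in fact being resumed; the problem is in the loop logic. Your code is basically a loop from 1 to N in which each iteration sends a message to trigger the next one. If an exception is thrown, the message for the next iteration is never sent, and this is where the supervisor comes in. Restart is the simple case, because it runs the code that starts the loop again. With Resume, the flow simply continues, but the message that would lead to the next iteration was never sent.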

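A simple fix is to change the order of operations in the doCount method: send the message to self first, then perform the risky operation. This should work with the Resume strategy, but I would test a few scenarios before actually using it. What I don't know is whether Akka discards the mailbox under the Restart strategy. I believe it does not, which means that after the actor restarts it would pick up the pending message.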
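The reordering described above could look roughly like the following. This is only a sketch, assuming the classic Akka actor API used in the question; SomeExternalDep and the CountingActor companion are repeated from the question so the snippet stands on its own, and the preRestart hook is left out on purpose.

```scala
import akka.actor.{Actor, ActorLogging, Props}

// Same flaky dependency as in the question: throws once, for the value 3.
object SomeExternalDep {
  private var flag = true

  def potentiallyThrows(curr: Int): Unit =
    if (curr == 3 && flag) {
      flag = false
      throw new RuntimeException("Something went wrong in external dependency")
    }
}

object CountingActor {
  def props(start: Int, end: Int): Props = Props(new CountingActor(start, end))

  case object Start
  final case class Count(n: Int)
}

class CountingActor(start: Int, end: Int) extends Actor with ActorLogging {
  private var curr: Int = start

  private def doCount(): Unit = {
    // Enqueue the next step first, so it is already in the mailbox
    // if the risky call below throws.
    if (curr <= end) {
      self ! CountingActor.Count(curr)
    }

    // If this throws and the supervisor answers with Resume, the actor keeps
    // its state and continues with the Count message enqueued above.
    SomeExternalDep.potentiallyThrows(curr)
  }

  override def receive: Receive = {
    case CountingActor.Start => doCount()
    case CountingActor.Count(n) =>
      log.info(s"Counting: $n")
      curr += 1
      doCount()
    case x => log.error(s"bad message $x")
  }
}
```

Under Resume, the Count message queued before the failure is processed next, so counting carries on from 3. Under Restart, that queued message should still be in the mailbox, so combining this ordering with the question's preRestart hook (which sends Start) would probably produce duplicate or out-of-order counts. That is one of the scenarios worth testing.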

Another workaround is to have the supervisor resend the message after it resumes the child actor.

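Edit: I did some digging in the Akka source code, and there is no obvious way to catch the resume event. There actually is an internal Resume event, but it is private to Akka and is not sent to your actual actor. I think that if you prefer the Resume strategy, you should not bother with the supervisor at all and should just catch the possible exceptions inside the actor (basically emulating the Resume strategy). That should give you the expected behavior without having to deal with possible corner cases.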
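Catching the exception inside the actor might look like the sketch below. It assumes the classic Akka actor API from the question and repeats SomeExternalDep and the companion object so it is self-contained. Using NonFatal and logging at warning level are illustrative choices here, not something the question prescribes.

```scala
import akka.actor.{Actor, ActorLogging, Props}
import scala.util.control.NonFatal

// Same flaky dependency as in the question: throws once, for the value 3.
object SomeExternalDep {
  private var flag = true

  def potentiallyThrows(curr: Int): Unit =
    if (curr == 3 && flag) {
      flag = false
      throw new RuntimeException("Something went wrong in external dependency")
    }
}

object CountingActor {
  def props(start: Int, end: Int): Props = Props(new CountingActor(start, end))

  case object Start
  final case class Count(n: Int)
}

class CountingActor(start: Int, end: Int) extends Actor with ActorLogging {
  private var curr: Int = start

  private def doCount(): Unit = {
    // Handle the failure locally so it never reaches the supervisor;
    // the actor keeps its state, much as it would under Resume.
    try SomeExternalDep.potentiallyThrows(curr)
    catch {
      case NonFatal(e) =>
        log.warning(s"External dependency failed at $curr: ${e.getMessage}")
    }

    // Always reached, so the counting loop keeps going.
    if (curr <= end) {
      self ! CountingActor.Count(curr)
    }
  }

  override def receive: Receive = {
    case CountingActor.Start => doCount()
    case CountingActor.Count(n) =>
      log.info(s"Counting: $n")
      curr += 1
      doCount()
    case x => log.error(s"bad message $x")
  }
}
```

With this version the supervisor never sees the RuntimeException, so the supervision strategy no longer matters for this particular failure, and the actor should log a warning at 3 and then continue counting from 3.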