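Akka Stream Source.queue backpressure strategy not working

Time: 2016-12-27 19:00:56

Tags: scala akka akka-stream reactive-streams

I am trying to understand why the code snippet below does what it does. I would have thought that, because the Sink cannot produce demand faster than the Source produces content, I would get Dropped results in response to some of the offers (the overflow strategy is set to dropBuffer), as well as an error and QueueClosed messages after the self-destruct step.

The snippet: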

package playground

import java.time.LocalDateTime
import java.util.concurrent.atomic.AtomicInteger

import akka.actor.{Actor, ActorLogging, ActorSystem, Props}
import akka.stream.QueueOfferResult.{Dropped, Enqueued, Failure, QueueClosed}
import akka.stream._
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.duration._

case object MessageToSink

object Playground extends App {

  implicit val actorSystem = ActorSystem("Playground")
  implicit val execCntxt = actorSystem.dispatcher

  val sinkActor = actorSystem.actorOf(Props[Actor2SinkFwder])
  actorSystem.scheduler.schedule(1 millisecond, 50 milliseconds, sinkActor, MessageToSink)

  println(s"Playground has started... ${LocalDateTime.now()}")
}

class Actor2SinkFwder extends Actor with ActorLogging {

  implicit val materializer = ActorMaterializer()
  implicit val execCtxt = context.dispatcher

  val flow = Source.queue[Int](bufferSize = 1, overflowStrategy = OverflowStrategy.dropBuffer)
    .to(Sink.foreach[Int] {
      i =>
        println(s"$i Sinking starts at ${LocalDateTime.now()}")
        Thread.sleep(150)
        if (i == 5) throw new RuntimeException("KaBoom!")
        println(s"$i Sinking completes at ${LocalDateTime.now()}")
    }).run()

  val i: AtomicInteger = new AtomicInteger(0)

  override def receive: Receive = {
    case MessageToSink =>
      val num = i.incrementAndGet()
      println(s"$num Sink Command received at ${LocalDateTime.now()}")
      flow.offer(num).collect {
        case Enqueued => println(s"$num Enqueued ${LocalDateTime.now}")
        case Dropped => println(s"$num Dropped ${LocalDateTime.now}")
        case Failure(err) => println(s"$num Failed ${LocalDateTime.now} $err")
        case QueueClosed => println(s"$num Failed ${LocalDateTime.now} QueueClosed")
      }
  }
}

Output:

Playground has started... 2016-12-27T18:35:29.574
1 Sink Command received at 2016-12-27T18:35:29.640
2 Sink Command received at 2016-12-27T18:35:29.642
3 Sink Command received at 2016-12-27T18:35:29.642
1 Sinking starts at 2016-12-27T18:35:29.649
1 Enqueued 2016-12-27T18:35:29.650
4 Sink Command received at 2016-12-27T18:35:29.688
5 Sink Command received at 2016-12-27T18:35:29.738
6 Sink Command received at 2016-12-27T18:35:29.788
1 Sinking completes at 2016-12-27T18:35:29.799
2 Sinking starts at 2016-12-27T18:35:29.800
2 Enqueued 2016-12-27T18:35:29.800
7 Sink Command received at 2016-12-27T18:35:29.838
8 Sink Command received at 2016-12-27T18:35:29.888
9 Sink Command received at 2016-12-27T18:35:29.938
2 Sinking completes at 2016-12-27T18:35:29.950
3 Sinking starts at 2016-12-27T18:35:29.951
3 Enqueued 2016-12-27T18:35:29.951
10 Sink Command received at 2016-12-27T18:35:29.988
11 Sink Command received at 2016-12-27T18:35:30.038
12 Sink Command received at 2016-12-27T18:35:30.088
3 Sinking completes at 2016-12-27T18:35:30.101
4 Sinking starts at 2016-12-27T18:35:30.101
4 Enqueued 2016-12-27T18:35:30.101
13 Sink Command received at 2016-12-27T18:35:30.138
14 Sink Command received at 2016-12-27T18:35:30.189
15 Sink Command received at 2016-12-27T18:35:30.238
4 Sinking completes at 2016-12-27T18:35:30.251
5 Sinking starts at 2016-12-27T18:35:30.251
5 Enqueued 2016-12-27T18:35:30.252
16 Sink Command received at 2016-12-27T18:35:30.288
17 Sink Command received at 2016-12-27T18:35:30.338
18 Sink Command received at 2016-12-27T18:35:30.388
19 Sink Command received at 2016-12-27T18:35:30.438
20 Sink Command received at 2016-12-27T18:35:30.488
21 Sink Command received at 2016-12-27T18:35:30.538
22 Sink Command received at 2016-12-27T18:35:30.588
23 Sink Command received at 2016-12-27T18:35:30.638
24 Sink Command received at 2016-12-27T18:35:30.688
25 Sink Command received at 2016-12-27T18:35:30.738
26 Sink Command received at 2016-12-27T18:35:30.788
etc...

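I think my misunderstanding is around the use of getAsyncCallback in the QueueSource class. Even though the offer call in QueueSource invokes the stageLogic with the correct offer details, the actual handler for it in the stage logic is not called until the previous element has finished processing, so none of the logic for checking the buffer size or applying the overflow strategy gets applied... :-/

2 answers:

Answer 0: (score: 6)

To see the result you expect, you should add an async stage between your Source and your Sink. This tells Akka to run the two stages in two distinct actors, by forcing an asynchronous boundary between them.

Without async, Akka optimizes execution by fusing everything into one actor, which makes the processing sequential. In your example, as you noticed, a message is offered to the queue only after the Thread.sleep(150) for the previous message has completed. More information on the topic can be found here.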

  val flow = Source.queue[Int](bufferSize = 1, overflowStrategy = OverflowStrategy.dropBuffer)
    .async
    .to(Sink.foreach[Int] {...}).run()

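Also, you should add one more case when matching on the .offer result: a Failure of the Future, which the Future is completed with when the downstream of the queue has failed. This applies to all messages offered after the first 5.

  // Needs scala.util.Success in scope (import scala.util.Success).
  override def receive: Receive = {
    case MessageToSink =>
      val num = i.incrementAndGet()
      println(s"$num Sink Command received at ${LocalDateTime.now()}")
      flow.offer(num).onComplete {
        case Success(Enqueued) => println(s"$num Enqueued ${LocalDateTime.now}")
        case Success(Dropped) => println(s"$num Dropped ${LocalDateTime.now}")
        case Success(Failure(err)) => println(s"$num Failed ${LocalDateTime.now} $err")
        case Success(QueueClosed) => println(s"$num Failed ${LocalDateTime.now} QueueClosed")
        // The Future itself failed, e.g. because the stream downstream of the queue has failed.
        case util.Failure(err) => println(s"$num Failed ${LocalDateTime.now} with exception $err")
      }
  }

Note that even after doing all of the above, you will not see any QueueOfferResult.Dropped results. That is because you chose the DropBuffer strategy. Every incoming message is queued (therefore producing an Enqueued message), kicking out the existing buffer contents. If you change the strategy to DropNew, you should start seeing some Dropped messages.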
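The difference between the two strategies can be illustrated with a small plain-Scala model. This is only a sketch of the semantics described above, not Akka's implementation: the object OverflowModel, its offer and offerAll functions, and the local Strategy and OfferResult types are hypothetical names made up for this illustration, and the model assumes a consumer that never drains the buffer.

```scala
import scala.collection.immutable.Queue

// Hypothetical model of a bounded buffer; not part of Akka.
object OverflowModel {
  sealed trait Strategy
  case object DropBuffer extends Strategy
  case object DropNew extends Strategy

  sealed trait OfferResult
  case object Enqueued extends OfferResult
  case object Dropped extends OfferResult

  // Offers one element to a buffer holding at most `capacity` elements.
  // Returns the new buffer contents and the result reported to the producer.
  def offer(buffer: Queue[Int], capacity: Int, strategy: Strategy, elem: Int): (Queue[Int], OfferResult) =
    if (buffer.size < capacity) (buffer.enqueue(elem), Enqueued)
    else strategy match {
      // The old contents are discarded and the new element takes their place,
      // so the producer still sees Enqueued.
      case DropBuffer => (Queue(elem), Enqueued)
      // The buffer is left untouched and the new element is rejected.
      case DropNew => (buffer, Dropped)
    }

  // Offers all elements in order while nothing is consumed from the buffer.
  def offerAll(capacity: Int, strategy: Strategy, elems: Seq[Int]): (Queue[Int], Vector[OfferResult]) =
    elems.foldLeft((Queue.empty[Int], Vector.empty[OfferResult])) {
      case ((buf, results), e) =>
        val (next, result) = offer(buf, capacity, strategy, e)
        (next, results :+ result)
    }
}
```

Calling OverflowModel.offerAll(1, OverflowModel.DropBuffer, Seq(1, 2, 3)) reports Enqueued for every offer and leaves only 3 in the buffer, while the same call with OverflowModel.DropNew reports Enqueued once, then Dropped twice, and leaves 1 in the buffer. In this model, as in the explanation above, elements lost under DropBuffer are never reported to the producer.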

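Answer 1: (score: 0)

I have found the answer to the question I wrote in the comments, and I think it is very relevant to the original question, so I want to add it as an answer (but the correct answer is Stefano's).

The element causing this behavior is a buffer, though not the one we configured explicitly, e.g. map(...).buffer(1, OverflowStrategy.dropBuffer).async, but an internal buffer built into the implementation. This buffer exists purely for performance and is part of the implementation's blueprint optimization.

While pipelining generally increases throughput, in practice passing an element across an asynchronous (and therefore thread-crossing) boundary has a significant cost. To amortize this cost, Akka Streams internally uses a windowed, batching backpressure strategy. It is windowed because, as opposed to a stop-and-wait protocol, multiple elements may be "in flight" concurrently with requests for elements. It is also batching because a new element is not requested immediately once an element has been drained from the window buffer; instead, multiple elements are requested after multiple elements have been drained. This batching strategy reduces the communication cost of propagating the backpressure signal through the asynchronous boundary.

It is no accident that the documentation on internal buffers sits next to that on explicit buffers and belongs to the "working with rate" section.

BatchingActorInputBoundary has the inputBuffer.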

  /* Bad: same number of emitted and consumed events, i.e. DOES NOT DROP
  Emitted: 1
  Emitted: 1
  Emitted: 1
  Consumed: 1
  Emitted: 1
  Emitted: 1
  Consumed: 1
  Consumed: 1
  Consumed: 1
  Consumed: 1
  */
  def example1() {
    val c = Source.tick(500 millis, 500 millis, 1)
      .map(x => {
        println("Emitted: " + x)
        x
      })
      .buffer(1, OverflowStrategy.dropBuffer).async
      .toMat(Sink.foreach[Int](x => {
        Thread.sleep(5000)
        println("Consumed: " + x)
      }))(Keep.left)
      .run
    Thread.sleep(3000)
    c.cancel()
  }

The example above, which leads to unexpected (for me) behavior, can be "solved" by reducing the size of the internal buffer with
      .toMat(Sink.foreach[Int](x => {
        Thread.sleep(5000)
        println("Consumed: " + x)
      }))(Keep.left).addAttributes(Attributes.inputBuffer(initial = 1, max = 1))

Now some elements from upstream are discarded, but there is still a minimal input buffer of size 1, and we get the following output:

Emitted: 1
Emitted: 1
Emitted: 1
Emitted: 1
Emitted: 1
Consumed: 1
Consumed: 1
Consumed: 1

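I hope this answer adds value to Stefano's answer.

The akka team is always one step ahead

In general, when time- or rate-driven processing stages exhibit strange behavior, one of the first solutions to try should be to decrease the input buffer of the affected elements to 1.

**Update:**

Konrad Malawski considered this a racy solution and advised me to implement this behavior as a GraphStage. Here it is.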

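import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}

// Flow stage that holds only the most recent upstream element until downstream asks for one.
class LastElement[A] extends GraphStage[FlowShape[A, A]] {
  private val in = Inlet[A]("last-in")
  private val out = Outlet[A]("last-out")

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
    // Latest element not yet pushed downstream; overwritten by each newer element.
    var pushPending: Option[A] = None

    // Start pulling immediately, independently of downstream demand.
    override def preStart(): Unit = pull(in)

    def pushIfAvailable(): Unit = if (isAvailable(out)) {
      pushPending.foreach(p => {
        push(out, p)
        pushPending = None
      })
    }

    setHandler(out, new OutHandler {
      override def onPull(): Unit = pushIfAvailable()
    })

    setHandler(in, new InHandler {
      override def onPush(): Unit = {
        pushPending = Some(grab(in))
        pushIfAvailable()
        pull(in)
      }
    })
  }

  override def shape: FlowShape[A, A] = FlowShape(in, out)
}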