Akka Streams is too slow compared with two threads using an ArrayBlockingQueue

Time: 2018-05-15 11:43:06

Tags: scala akka akka-stream

I am working on a performance problem that appeared when I switched my current project to Akka Streams.

After simplifying the problem, it seems that Akka Streams passes far fewer messages than I expected.

Here are two very simple pieces of code. Each writes to a file on disk only 10 bytes at a time.

The first uses two threads and an ArrayBlockingQueue connecting them:

import java.nio.file.{Files, Paths}
import java.util.concurrent.ArrayBlockingQueue

val bw = Files.newBufferedWriter(Paths.get("test.txt"))
val target = "0123456789".toCharArray
val abq = new ArrayBlockingQueue[Array[Char]](10000)

new Thread(new Runnable {
  override def run(): Unit = {
    while (true) {
      bw.write(abq.take())
    }
  }
}).start()

while (true) {
  abq.put(target)
}

The second uses Akka Streams:

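import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Attributes, OverflowStrategy}
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

implicit val system: ActorSystem = ActorSystem("TestActorSystem")
implicit val materializer: ActorMaterializer = ActorMaterializer()

// Source and Sink run in two separate actors
// Both the output of the Source and the input of the Sink are buffered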
Source
  .repeat(ByteString("0123456789"))
  .buffer(8192, OverflowStrategy.backpressure)
  .async
  .runWith(
    FileIO
      .toPath(Paths.get("test.txt"))
      .withAttributes(Attributes.inputBuffer(8192, 8192))
  )

I found that the first writes the file at 27.4 MB/s, while the second writes at only 3.4 MB/s on my test machine. The thread-with-ArrayBlockingQueue version is about 8 times faster than the Akka one.
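The question does not say how these throughput figures were obtained. As a minimal sketch of one way to measure the queue-based variant, the code below runs it for a fixed number of messages and converts the elapsed time into MB/s. The object name `QueueThroughputSketch`, the `messages` parameter, and the temporary file are assumptions made for illustration, not part of the original code.

```scala
import java.nio.file.Files
import java.util.concurrent.ArrayBlockingQueue

object QueueThroughputSketch {
  // Pushes `messages` 10-char chunks through an ArrayBlockingQueue into a
  // BufferedWriter and returns (charsWritten, elapsedNanos).
  // `messages` is a hypothetical knob; the original code loops forever.
  def run(messages: Int): (Long, Long) = {
    val path = Files.createTempFile("abq", ".txt")
    val bw = Files.newBufferedWriter(path)
    val target = "0123456789".toCharArray
    val abq = new ArrayBlockingQueue[Array[Char]](10000)
    var written = 0L

    val consumer = new Thread(new Runnable {
      override def run(): Unit = {
        var i = 0
        while (i < messages) {
          val chunk = abq.take()
          bw.write(chunk)
          written += chunk.length
          i += 1
        }
        bw.close() // flush buffered data so the full write is timed
      }
    })

    val start = System.nanoTime()
    consumer.start()
    var i = 0
    while (i < messages) {
      abq.put(target)
      i += 1
    }
    consumer.join() // after join(), the consumer's update of `written` is visible
    val elapsed = System.nanoTime() - start
    Files.deleteIfExists(path)
    (written, elapsed)
  }

  def main(args: Array[String]): Unit = {
    val (chars, nanos) = run(5000000)
    // chars per nanosecond * 1000 = MB/s (1 MB = 10^6 bytes, 1 ASCII char = 1 byte)
    println(f"${chars * 1000.0 / nanos}%.1f MB/s")
  }
}
```

A single short run like this includes JIT warm-up, so the printed number is only indicative and will differ between machines.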

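I tried changing the Sink from FileIO to a hand-written Sink that writes to a BufferedWriter. That raised the speed of the second version to 5.5 MB/s, but it is still about 5 times slower than the first.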

As I understand it, Akka Streams should perform much better than this.

Is there anything wrong with what I am doing in this case?

1 Answer:

Answer 0: (score: 0)

I have found out what really makes it slow in this case.

I replaced the FileIO sink from the question with a hand-written one that has time counters, in order to measure the cost of each step in the sink.

The new sink is here:

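import java.nio.file.{Files, Paths}

import akka.stream.{Attributes, Inlet, SinkShape}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler}

final class FileWriteSink extends GraphStage[SinkShape[Array[Char]]] {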

  private val in: Inlet[Array[Char]] = Inlet("ArrayOfCharInlet")

  override def shape: SinkShape[Array[Char]] = SinkShape.of(in)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = {
    new GraphStageLogic(shape) {
      // Note: the operations on the vars below are not thread-safe,
      // but they are good enough to show large-scale time differences at a relatively low cost
      private var count = 0L

      private var grabTime = 0L
      private var writeTime = 0L
      private var pullTime = 0L
      private var gapTime = 0L
      private var counterTime = 0L

      private var lastTime = 0L
      private var currTime = System.nanoTime()

      @inline private def timeDiff(): Long = {
        lastTime = currTime
        currTime = System.nanoTime()
        currTime - lastTime
      }

      private val bw = Files.newBufferedWriter(Paths.get("test.xml"))
      setHandler(in, new InHandler {
        override def onPush(): Unit = {
          gapTime += timeDiff()
          count += 1
          if (count % 1000000 == 0) {
            println(s"count: $count, gapTime: $gapTime, counterTime: $counterTime, grabTime: $grabTime, writeTime: $writeTime, pullTime: $pullTime")
            println(s"count: $count, gapTime-avg: ${gapTime / count}, counterTime-avg: ${counterTime / count}, grabTime-avg: ${grabTime / count}, writeTime-avg: ${writeTime / count}, pullTime-avg: ${pullTime / count}")
          }
          counterTime += timeDiff()
          val v = grab(in)
          grabTime += timeDiff()
          bw.write(v)
          writeTime += timeDiff()
          pull(in)
          pullTime += timeDiff()
        }
      })

      override def preStart(): Unit = {
        pull(in)
      }
    }
  }

}

Then I got this log from my test environment:

count: 1000000, gapTime: 3220562882, counterTime: 273008576, grabTime: 264956553, writeTime: 355040917, pullTime: 260033342
count: 1000000, gapTime-avg: 3220, counterTime-avg: 273, grabTime-avg: 264, writeTime-avg: 355, pullTime-avg: 260
count: 2000000, gapTime: 6307318517, counterTime: 549671865, grabTime: 532654603, writeTime: 708526613, pullTime: 524305026
count: 2000000, gapTime-avg: 3153, counterTime-avg: 274, grabTime-avg: 266, writeTime-avg: 354, pullTime-avg: 262
count: 3000000, gapTime: 9403004835, counterTime: 821901662, grabTime: 797670212, writeTime: 1054416804, pullTime: 786163401
count: 3000000, gapTime-avg: 3134, counterTime-avg: 273, grabTime-avg: 265, writeTime-avg: 351, pullTime-avg: 262

It turns out that the time gap between pull() and the next onPush() call is the really slow part here.

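Even when the buffer is full, so that the Sink should not need to wait for the Source to generate the next element, there is still a gap of nearly 3 μs between two onPush() calls in my test environment.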
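To put the gap in proportion, the arithmetic on the first log line (count: 1000000) can be sketched as follows. The totals are copied from the log, in nanoseconds; the object name `GapShare` and its field names are only illustrative.

```scala
object GapShare {
  // Totals in nanoseconds, copied from the "count: 1000000" log line.
  val count = 1000000L
  val gapTime = 3220562882L
  val counterTime = 273008576L
  val grabTime = 264956553L
  val writeTime = 355040917L
  val pullTime = 260033342L

  // Sum of all measured phases of one onPush() cycle.
  val total: Long = gapTime + counterTime + grabTime + writeTime + pullTime

  // Fraction of the measured time spent between pull() and the next onPush().
  val gapShare: Double = gapTime.toDouble / total

  // Average measured time per 10-char element.
  val nanosPerElement: Long = total / count

  def main(args: Array[String]): Unit = {
    println(f"gap share: ${gapShare * 100}%.1f%%")
    println(s"ns per element: $nanosPerElement")
  }
}
```

On these numbers the gap accounts for roughly 74% of the measured time, while the write itself takes about 355 ns per element. The counterTime phase covers little more than an increment and a modulo check, so its 273 ns average presumably reflects mostly the cost of the timing calls themselves. The absolute values should be read with that overhead in mind.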

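So what I should expect is that Akka Streams has good overall throughput, while the gap between two onPush() calls needs to be kept in mind and handled carefully when designing the structure of the actual stream.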
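One possible way to account for the per-element gap, which the answer does not propose or measure, is to make each element larger so that the fixed cost is paid once per batch instead of once per 10 bytes. The sketch below assumes the same Akka Streams API used in the question and concatenates chunks with `grouped` before the asynchronous boundary. The object name `BatchedWriteSketch`, the values of `elements` and `batchSize`, and the temporary file are hypothetical.

```scala
import java.nio.file.Files

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

import scala.concurrent.Await
import scala.concurrent.duration._

object BatchedWriteSketch {
  // Hypothetical values chosen for illustration only.
  val elements = 1000000
  val batchSize = 1000

  // Writes `elements` 10-byte chunks in batches and returns the bytes written.
  def run(): Long = {
    implicit val system: ActorSystem = ActorSystem("BatchedWriteSketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val path = Files.createTempFile("batched", ".txt")
    try {
      val result = Source
        .repeat(ByteString("0123456789"))
        .take(elements)
        // Concatenate batchSize small chunks before the async boundary,
        // so each element crossing it carries 10 * batchSize bytes.
        .grouped(batchSize)
        .map(_.foldLeft(ByteString.empty)(_ ++ _))
        .async
        .runWith(FileIO.toPath(path))
      Await.result(result, 1.minute).count
    } finally {
      system.terminate()
      Files.deleteIfExists(path)
    }
  }
}
```

Whether this actually narrows the difference to the queue-based version would have to be measured. Batching also increases latency, since an element is held back until its batch is full.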