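I am troubleshooting a performance problem that appeared when switching my current project to Akka Streams.
After simplifying the problem, it seems that Akka Streams passes far fewer messages per second than I expected.
Here are two very simple pieces of code; each writes 10 bytes at a time to a file on disk.
The first uses two threads and an ArrayBlockingQueue connecting them: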
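    import java.nio.file.{Files, Paths}
    import java.util.concurrent.ArrayBlockingQueue

    val bw = Files.newBufferedWriter(Paths.get("test.txt"))
    val target = "0123456789".toCharArray
    val abq = new ArrayBlockingQueue[Array[Char]](10000)

    // Consumer thread: drains the queue and writes each element to the file
    new Thread(new Runnable {
      override def run(): Unit = {
        while (true) {
          bw.write(abq.take())
        }
      }
    }).start()

    // Producer (current thread): keeps the queue full
    while (true) {
      abq.put(target)
    }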
The second uses Akka Streams:
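    import java.nio.file.Paths
    import akka.actor.ActorSystem
    import akka.stream.{ActorMaterializer, Attributes, OverflowStrategy}
    import akka.stream.scaladsl.{FileIO, Source}
    import akka.util.ByteString

    implicit val system: ActorSystem = ActorSystem("TestActorSystem")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    // Source and Sink run in two separate actors (note the async boundary)
    // Both the output of the Source and the input of the Sink are buffered
    Source
      .repeat(ByteString("0123456789"))
      .buffer(8192, OverflowStrategy.backpressure)
      .async
      .runWith(
        FileIO
          .toPath(Paths.get("test.txt"))
          .withAttributes(Attributes.inputBuffer(8192, 8192))
      )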
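On my test machine, the first one writes to the file at 27.4 MB/s, while the second one only reaches 3.4 MB/s. The thread-with-ArrayBlockingQueue version is about 8 times faster than the Akka one.
I tried changing the Sink from FileIO to a hand-written Sink that writes to a BufferedWriter. That raised the second version to 5.5 MB/s, but it is still about 5 times slower than the first.
As I understand it, Akka Streams should perform better than this.
Am I doing something wrong in this case?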
Answer 0 (score: 0)
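I figured out what is really slowing it down in this case.
I replaced the FileIO sink from the question with a hand-written one that carries timing counters, so that I could measure the cost of each step inside the sink.
The new sink is here: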
    import java.nio.file.{Files, Paths}
    import akka.stream.{Attributes, Inlet, SinkShape}
    import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler}

    final class FileWriteSink extends GraphStage[SinkShape[Array[Char]]] {
      private val in: Inlet[Array[Char]] = Inlet("ArrayOfCharInlet")

      override def shape: SinkShape[Array[Char]] = SinkShape.of(in)

      override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = {
        new GraphStageLogic(shape) {
          // note that the operations on the vars below are not thread-safe,
          // but they are good enough to show large time differences at a relatively low cost
          private var count = 0L
          private var grabTime = 0L
          private var writeTime = 0L
          private var pullTime = 0L
          private var gapTime = 0L
          private var counterTime = 0L
          private var lastTime = 0L
          private var currTime = System.nanoTime()

          // Returns the nanoseconds elapsed since the previous call
          @inline private def timeDiff(): Long = {
            lastTime = currTime
            currTime = System.nanoTime()
            currTime - lastTime
          }

          private val bw = Files.newBufferedWriter(Paths.get("test.xml"))

          setHandler(in, new InHandler {
            override def onPush(): Unit = {
              // time between the previous pull() and this onPush()
              gapTime += timeDiff()
              count += 1
              if (count % 1000000 == 0) {
                println(s"count: $count, gapTime: $gapTime, counterTime: $counterTime, grabTime: $grabTime, writeTime: $writeTime, pullTime: $pullTime")
                println(s"count: $count, gapTime-avg: ${gapTime / count}, counterTime-avg: ${counterTime / count}, grabTime-avg: ${grabTime / count}, writeTime-avg: ${writeTime / count}, pullTime-avg: ${pullTime / count}")
              }
              counterTime += timeDiff()
              val v = grab(in)
              grabTime += timeDiff()
              bw.write(v)
              writeTime += timeDiff()
              pull(in)
              pullTime += timeDiff()
            }
          })

          override def preStart(): Unit = {
            pull(in)
          }
        }
      }
    }
Then I got this log from my test environment (all times are in nanoseconds):
count: 1000000, gapTime: 3220562882, counterTime: 273008576, grabTime: 264956553, writeTime: 355040917, pullTime: 260033342
count: 1000000, gapTime-avg: 3220, counterTime-avg: 273, grabTime-avg: 264, writeTime-avg: 355, pullTime-avg: 260
count: 2000000, gapTime: 6307318517, counterTime: 549671865, grabTime: 532654603, writeTime: 708526613, pullTime: 524305026
count: 2000000, gapTime-avg: 3153, counterTime-avg: 274, grabTime-avg: 266, writeTime-avg: 354, pullTime-avg: 262
count: 3000000, gapTime: 9403004835, counterTime: 821901662, grabTime: 797670212, writeTime: 1054416804, pullTime: 786163401
count: 3000000, gapTime-avg: 3134, counterTime-avg: 273, grabTime-avg: 265, writeTime-avg: 351, pullTime-avg: 262
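It turns out that the gap between pull() and the next onPush() call is what is really slow.
Even when the buffer is full, so that the Sink should not have to wait for the Source to produce the next element, there is still a gap of nearly 3 μs between two onPush() calls in my test environment.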
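To make the proportions concrete, here is a small worked calculation using the per-element averages from the `count: 3000000` log line. The object and field names are hypothetical, and the 10 bytes per element assumes the ASCII string "0123456789" from the question. The timing calls in the instrumented sink add their own cost, so the implied throughput is only a rough figure, not a measurement.

```scala
object OnPushBudget {
  // Per-element averages in nanoseconds, copied from the count = 3000000 log line.
  val gapNs: Long = 3134L     // between pull() and the next onPush()
  val counterNs: Long = 273L  // counter update and periodic logging
  val grabNs: Long = 265L     // grab(in)
  val writeNs: Long = 351L    // bw.write(v)
  val pullNs: Long = 262L     // pull(in)

  // Assumption: each element is the 10-character ASCII string from the question.
  val bytesPerElement: Int = 10

  // Total time spent per element and the share of it taken by the gap.
  val totalNs: Long = gapNs + counterNs + grabNs + writeNs + pullNs
  val gapShare: Double = gapNs.toDouble / totalNs

  // Throughput implied by the per-element budget, in bytes per second.
  val bytesPerSecond: Double = bytesPerElement * 1e9 / totalNs

  def main(args: Array[String]): Unit = {
    println(f"total per element: $totalNs ns")
    println(f"gap share: ${gapShare * 100}%.1f %%")
    println(f"implied throughput: ${bytesPerSecond / 1e6}%.2f MB/s")
  }
}
```

With these numbers the gap accounts for roughly three quarters of the time spent per element, while the actual write is well under a tenth of it.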
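So what I should expect is that Akka Streams has good overall throughput, but the gap between two onPush() calls needs to be understood and handled carefully when designing the structure of a real stream.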
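One common way to handle a fixed per-element cost is to make each element bigger, so the cost is paid once per batch instead of once per 10 bytes. The sketch below illustrates the idea with plain JDK classes only, reusing the ArrayBlockingQueue setup from the question; `BatchedHandoff`, `batches` and `transfer` are hypothetical names, and it writes to an in-memory `StringWriter` rather than a file. In Akka Streams the analogous step would be an operator such as `grouped` or `batch` placed before the async boundary; that is an assumption about how to apply the idea and was not measured here.

```scala
import java.io.StringWriter
import java.util.concurrent.ArrayBlockingQueue

object BatchedHandoff {
  // Hypothetical helper: merges `batchSize` small chunks into one array so that
  // a single queue hand-off carries many elements.
  def batches(chunks: Iterator[Array[Char]], batchSize: Int): Iterator[Array[Char]] =
    chunks.grouped(batchSize).map(group => Array.concat(group: _*))

  // Sends `total` copies of the 10-character target through a bounded queue in
  // batches. Returns (number of hand-offs, number of characters written).
  def transfer(total: Int, batchSize: Int): (Int, Int) = {
    val target = "0123456789".toCharArray
    val queue = new ArrayBlockingQueue[Array[Char]](16)
    val out = new StringWriter()
    val endMarker = Array.empty[Char] // an empty array signals end of input
    var handoffs = 0

    // Consumer thread: one write per hand-off, however large the batch is.
    val consumer = new Thread(new Runnable {
      override def run(): Unit = {
        var next = queue.take()
        while (next.nonEmpty) {
          out.write(next)
          handoffs += 1
          next = queue.take()
        }
      }
    })
    consumer.start()

    // Producer (current thread): hands off whole batches, then the end marker.
    batches(Iterator.fill(total)(target), batchSize).foreach(b => queue.put(b))
    queue.put(endMarker)
    consumer.join() // join() makes the consumer's updates visible here

    (handoffs, out.getBuffer.length)
  }
}
```

For example, `BatchedHandoff.transfer(1000, 100)` moves the same 10,000 characters as `BatchedHandoff.transfer(1000, 1)` but with 10 hand-offs instead of 1,000. Whether this closes the gap in a real stream depends on the workload and has to be measured.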