Does groupBy leak memory in akka-stream?

Asked: 2015-11-23 07:08:56

Tags: scala akka-stream

I want to write a flow in akka-stream that groups events from an infinite stream by session_uid and computes the total traffic for each session (see my previous question for details).

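I intended to use the Source#groupBy function to group events by session_uid, but it appears that this function accumulates every grouping key internally and offers no way to release them. This eventually causes a java.lang.OutOfMemoryError: Java heap space exception. Here is code that reproduces it: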

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.util.Random

object GroupByMemoryLeakApplication extends App {
  implicit val system = ActorSystem()
  import system.dispatcher

  implicit val materializer = ActorMaterializer()

  val bigString = Random.nextString(512 * 1024)

  // An effectively infinite stream of events (each element stands in for a session id)
  val eventsSource = Source(() => (1 to 1000000000).iterator)
    .map((i) => { (i, bigString + i) })

  // This flow passes each event through groupBy; the large string is the grouping key
  val groupByFlow = Flow[(Int, String)]
    .groupBy(_._2)
    .map {
      case (sessionUid, sessionEvents) =>
        sessionEvents
          .map(e => { println(e._1); e })
          .runWith(Sink.head)
    }
    .mapAsync(4)(identity)

  eventsSource
    .via(groupByFlow)
    .runWith(Sink.ignore)
    .onComplete(_ => system.shutdown())
}

So, how can I release a grouping key (sessionUid) inside groupBy once the related event stream (sessionEvents) has been completely processed?

Or is there perhaps another way to group events by session_uid with akka-stream?
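
To make the second question concrete, here is a minimal sketch of the behaviour I am after, written in plain Scala and not tied to any akka-stream API. It keeps a running total per open session and drops the key as soon as the session ends. The Event type, its `last` end-of-session flag, and the names SessionTraffic, step and run are all assumptions made up for illustration; my real events have no such flag, so how a session end would be detected is left open.

```scala
object SessionTraffic {
  // Hypothetical event model (an assumption, not part of the original code):
  // `last` marks the final event of a session.
  final case class Event(sessionUid: String, bytes: Long, last: Boolean)

  // Fold one event into the map of open sessions.
  // Returns the updated map and, if the session just ended, its final total.
  def step(open: Map[String, Long], event: Event): (Map[String, Long], Option[(String, Long)]) = {
    val total = open.getOrElse(event.sessionUid, 0L) + event.bytes
    if (event.last) (open - event.sessionUid, Some(event.sessionUid -> total)) // key is released here
    else (open.updated(event.sessionUid, total), None)
  }

  // Demo driver: applies `step` to a finite batch and collects finished sessions.
  def run(events: Seq[Event]): (Map[String, Long], Vector[(String, Long)]) =
    events.foldLeft((Map.empty[String, Long], Vector.empty[(String, Long)])) {
      case ((open, finished), event) =>
        val (nextOpen, justFinished) = step(open, event)
        (nextOpen, finished ++ justFinished)
    }
}
```

With this shape, memory is bounded by the number of sessions that are open at the same time, not by the number of sessions ever seen. Whether something equivalent can be expressed with groupBy, or needs a different stage, is exactly what I am asking.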

0 answers:

No answers yet