I have a flow; here is the simplified code:
// flow that splits a group of lines into individual lines
val splitLines = Flow[List[Evt]].mapConcat(identity)

// sink that produces Kafka records
val kafkaSink: Sink[Evt, Future[Done]] = Flow[Evt]
  .map(evt => new ProducerRecord[Array[Byte], String](evt.eventType, evt.value))
  .toMat(Producer.plainSink(kafka))(Keep.right)
val routes = {
  path("ingest") {
    post {
      (entity(as[List[ReactiveEvent]]) & extractMaterializer) { (eventIngestList, mat) =>
        val ingest = Source.single(eventIngestList).via(splitLines).runWith(kafkaSink)(mat)
        // onComplete is a directive: use its result route directly rather than
        // shadowing it with a second complete(...)
        onComplete(ingest) {
          case Success(_)  => complete("OK")
          case Failure(ex) => complete((StatusCodes.InternalServerError, s"An error occurred: ${ex.getMessage}"))
        }
      }
    }
  }
}
Could you point out which parts run in parallel and which run sequentially?
I think mapConcat serializes the events in the stream, so how can I parallelize the stream so that each step after mapConcat is processed in parallel?
Is a simple mapAsyncUnordered enough, or should I use the GraphDSL with Balance and Merge?
Answer (score: 2)
In your case, I think it would be sequential. You also receive the complete request before you start pushing anything to Kafka. I would use the extractDataBytes directive instead, which gives you src: Source[ByteString, Any], and then process it along these lines:
src
  .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
  .map(_.utf8String)
  .mapConcat { line =>
    line.split(",").toList // mapConcat needs an immutable Iterable, so convert the Array
  }
  .async // introduces an asynchronous boundary before the sink
  .runWith(kafkaSink)(mat)
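To the mapAsyncUnordered part of the question: a minimal sketch of how per-event work after mapConcat could be parallelized. This is an illustration, not the answerer's code; parallelism = 4 is an arbitrary choice, and produceToKafka is a hypothetical stand-in for whatever Future-returning per-event call you have:

```scala
import scala.concurrent.Future
import akka.Done
import akka.stream.scaladsl.{Sink, Source}

// Hypothetical helper: any Future-returning work per event (e.g. a Kafka send)
def produceToKafka(evt: Evt): Future[Done] = ???

Source.single(eventIngestList)
  .mapConcat(identity)                // flattening itself stays sequential
  .mapAsyncUnordered(parallelism = 4)(produceToKafka) // up to 4 events in flight
  .runWith(Sink.ignore)
```

mapAsyncUnordered runs up to `parallelism` Futures concurrently and emits results as they complete, without preserving order, which is usually fine for fire-and-forget ingestion. Note that Producer.plainSink already batches sends internally, so the simple .async boundary shown in the answer may be all you need before reaching for GraphDSL with Balance and Merge.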