How to implement a stream with skip and a conditional stop

Date: 2018-06-29 19:24:47

Tags: akka-stream

I am trying to implement batch processing. My algorithm:

1) First, I need to request items from the db, starting with skip = 0. If there are no items, processing stops completely.

  case class Item(i: Int)

  def getItems(skip: Int): Future[Seq[Item]] = {
    Future((skip until (skip + (if (skip < 756) 100 else 0))).map(Item))
  }

2) Then do a heavy job on each item (parallelism = 4):

  def heavyJob(item: Item): Future[String] = Future {
    Thread.sleep(1000)
    item.i.toString + " done"
  }

3) After all items have been processed, go back to step 1 with skip += 100.

What I have tried:

val dbSource: Source[List[Item], _] = Source.fromFuture(getItems(0).map(_.toList))

val flattened: Source[Item, _] = dbSource.mapConcat(identity)

val procced: Source[String, _] = flattened.mapAsync(4)(item => heavyJob(item))

procced.runWith(Sink.onComplete(t => println("Complete: " + t.isSuccess)))

But I don't know how to implement the paging.

1 Answer:

Answer 0: (score: 0)

The skip increment can be handled with an Iterator as the underlying source of skip values:

val skipIncrement = 100

val skipIterator : () => Iterator[Int] = 
  () => Iterator.from(0, skipIncrement)
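As a quick check of what this iterator produces, here is a minimal plain-Scala sketch (no Akka required). The object name `SkipDemo` and the value `firstSkips` are hypothetical names used only for illustration.

```scala
object SkipDemo {
  // Same definitions as in the answer.
  val skipIncrement = 100

  val skipIterator: () => Iterator[Int] =
    () => Iterator.from(0, skipIncrement)

  // Hypothetical helper: the first five offsets that would be requested.
  val firstSkips: List[Int] = skipIterator().take(5).toList

  def main(args: Array[String]): Unit =
    println(firstSkips) // List(0, 100, 200, 300, 400)
}
```

The iterator itself is infinite, so something downstream has to decide when to stop asking for more pages.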

This Iterator can then be used to drive an akka Source that fetches the items and keeps processing until a query returns an empty Seq:

val databaseStillHasValues : Seq[Item] => Boolean = 
  (dbValues) => !dbValues.isEmpty

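val itemSource : Source[Item, _] = 
  Source.fromIterator(skipIterator)
        .mapAsync(1)(getItems)              // one query at a time, in skip order
        .takeWhile(databaseStillHasValues)  // complete on the first empty page
        .mapConcat(_.toList)                // toList gives the immutable collection mapConcat needs on Scala 2.12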
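To see why this pipeline terminates, the same stop condition can be sketched with plain Scala iterators. This is only an analogue of the stream above, not Akka code: `getItemsSync` is a hypothetical synchronous stand-in for the question's `getItems`, and `PagingDemo` and `allItems` are illustrative names.

```scala
object PagingDemo {
  case class Item(i: Int)

  // Assumption: same paging rule as the question's getItems, minus the Future.
  def getItemsSync(skip: Int): Seq[Item] =
    (skip until (skip + (if (skip < 756) 100 else 0))).map(Item(_))

  // Request pages at skip = 0, 100, 200, ... and stop at the first empty page.
  val allItems: List[Item] =
    Iterator.from(0, 100)
      .map(getItemsSync)
      .takeWhile(_.nonEmpty)
      .flatMap(page => page)
      .toList

  def main(args: Array[String]): Unit =
    println(s"${allItems.size} items, last = ${allItems.last}")
}
```

With the question's sample data, the pages at skip = 0 through 700 each hold 100 items and the page at skip = 800 is empty, so 800 items (0 to 799) are emitted before the pipeline stops.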

heavyJob can be used within a Flow:

val heavyParallelism = 4

val heavyFlow : Flow[Item, String, _] = 
  Flow[Item].mapAsync(heavyParallelism)(heavyJob)

Finally, the Source and Flow can be attached to a Sink:

// Each element reaching the sink is a String result of heavyJob
val printSink = Sink.foreach[String](t => println(s"Complete: $t"))

itemSource.via(heavyFlow)
          .runWith(printSink)