Scala: Why is this Future.traverse so slow? The body is not running in parallel

Asked: 2017-09-27 00:51:40

Tags: scala

val startTime = System.currentTimeMillis
for {
  data <- Future.traverse(lessons) { lesson =>
    val startTime2 = System.currentTimeMillis
    for ( f <- analyticsService.getData(lesson.id) ) yield {
      println(s"Future took ${startTime2 - System.currentTimeMillis}"); f
    }
  }
} yield {
  println(s"end ${startTime - System.currentTimeMillis}")
  data
}

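analyticsService.getData(lesson.id) appears to run sequentially rather than in parallel. To demonstrate this, I replaced analyticsService.getData(lesson.id) with Future { Thread.sleep(1000) }. The output is shown below; the values are negative because the code subtracts the current time from the start time.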

Future took -1000
Future took -1001
Future took -1000
Future took -1001
Future took -1001
Future took -1001
Future took -1001
Future took -2001
Future took -2001
Future took -2002
Future took -2002
Future took -2001
Future took -3000
Future took -3001
Future took -3002
Future took -3002
Future took -3002
Future took -4000
Future took -4001
Future took -4002
Future took -4002
Future took -4002
Future took -4002
Future took -4002
Future took -5001
Future took -5001
Future took -5002
Future took -5002
Future took -5002
Future took -5004
Future took -5004
....
Future took -62601
Future took -62601
Future took -62601
Future took -62601
Future took -62601
Future took -62601
Future took -62601
Future took -62600
Future took -63597
Future took -63598
Future took -63598
Future took -63599
Future took -63601
Future took -63601

From this it looks like the futures run in parallel in batches, but is there any way to speed this up? Can it run with more parallelism?

2 Answers:

Answer 0 (score: 1)

Most likely because analyticsService.getData(lesson.id) is slow.

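The assignment val a = analyticsService.getData(lesson.id) returns immediately, because (I'm making a small leap here) analyticsService.getData(lesson.id) returns a Future. This means that when you reach the first println, the work inside analyticsService.getData(lesson.id) has not finished; it has only been scheduled inside a Future.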

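The println inside the yield is not executed until much later, because it only runs after Future.traverse completes, and that happens only after all of the child futures have completed.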
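
The ordering described above can be sketched as follows. This is a minimal illustration, not the asker's code: getData here is a hypothetical stand-in that sleeps for 100 ms, and the events queue exists only to record what happens when.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseOrdering {
  // Hypothetical stand-in for analyticsService.getData: returns a Future right away.
  def getData(id: Int): Future[Int] = Future { Thread.sleep(100); id * 2 }

  // Thread-safe log of events, in the order they were recorded.
  val events = new ConcurrentLinkedQueue[String]()

  def run(): List[Int] = {
    val all = Future.traverse(List(1, 2, 3)) { id =>
      val f = getData(id)              // returns immediately; the work is only scheduled
      events.add(s"scheduled $id")
      f.map { v => events.add(s"done $id"); v }
    }
    // Plays the role of the outer yield: runs only after every child future is done.
    val result = all.map { data => events.add("end"); data }
    Await.result(result, 10.seconds)
  }
}
```

After TraverseOrdering.run(), the "end" entry is always the last one in events, because the outer map can only fire once every child future has completed.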

Answer 1 (score: 0)

analyticsService.getData(lesson.id) probably performs an operation inside a Future that blocks for a long time. Thread.sleep is one example.

You can wrap that code in blocking { ... } (imported as scala.concurrent.blocking) to hint to the ExecutionContext that the code blocks, so that it can allocate more threads.
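
A minimal sketch of the blocking hint, assuming the default global ExecutionContext. slowCall is a hypothetical stand-in for the blocking work inside analyticsService.getData, and the 200 ms sleep is arbitrary.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BlockingHint {
  // Hypothetical stand-in for a call that blocks its thread.
  def slowCall(id: Int): Future[Int] = Future {
    blocking {               // tells the ExecutionContext this section blocks
      Thread.sleep(200)
    }
    id
  }

  // Runs n calls through Future.traverse; returns (results, elapsed milliseconds).
  def run(n: Int): (List[Int], Long) = {
    val start = System.nanoTime()
    val results =
      Await.result(Future.traverse((1 to n).toList)(id => slowCall(id)), 30.seconds)
    (results, (System.nanoTime() - start) / 1000000)
  }
}
```

Comparing the elapsed time of BlockingHint.run(32) with and without the blocking wrapper should show whether the hint helps on a given machine; the actual numbers depend on the core count and on how the pool is configured.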

Alternatively, you can use a separate ExecutionContext for blocking I/O tasks, such as an unbounded thread pool:

import scala.concurrent._
import java.util.concurrent.{SynchronousQueue, ThreadPoolExecutor, TimeUnit}


implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutor(new ThreadPoolExecutor(
    0, Int.MaxValue,
    60, TimeUnit.SECONDS,
    new SynchronousQueue[Runnable](false)))

/* your code here */

(This is shamelessly stolen from the Scheduler.io factory method in Monix.)

Finally, in a real-life scenario, if you can make analyticsService.getData(lesson.id) accept multiple IDs, you can speed everything up significantly.
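
A rough sketch of what such a batched call might look like. Everything here is hypothetical: Lesson, getDataBatch, and the String payload are placeholders, since the real analyticsService API is not shown in the question.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object BatchedLookup {
  // Placeholder for the question's lesson type.
  final case class Lesson(id: Int)

  // Hypothetical batch endpoint: one call for many IDs instead of one call per ID.
  def getDataBatch(ids: Seq[Int]): Future[Map[Int, String]] =
    Future { ids.map(id => id -> s"data-$id").toMap }

  // One request for all lessons; results are matched back to lessons by ID.
  def loadAll(lessons: Seq[Lesson]): Future[Seq[(Lesson, String)]] =
    getDataBatch(lessons.map(_.id)).map { byId =>
      lessons.flatMap(l => byId.get(l.id).map(d => l -> d))
    }
}
```

With this shape, the caller makes a single request rather than one Future per lesson, so the size of the thread pool stops being the limiting factor.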