Question

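I perform a lot of computations that contribute to a final result, with no restriction on the order of the contributions. It seems that Futures should be able to speed this step up, and they do, but not in the way I expected. Here is code that compares the performance of several ways of doing an integer division: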

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object scale_me_up {
  def main(args: Array[String]): Unit = {
    val M = 500 * 1000
    val N = 5
    Thread.sleep(3210) // let launcher settle down
    for (it <- 0 until 15) {
      val method = it % 3
      val start = System.currentTimeMillis()
      val result = divide(M, N, method)
      val elapsed = System.currentTimeMillis() - start
      assert(result == M / N)
      if (it >= 6) {
        val methods = Array("ordinary", "fast parallel", "nice parallel")
        val name = methods(method)
        println(f"$name%15s: $elapsed ms")
      }
    }
  }

  def is_multiple_of(m: Int, n: Int): Boolean = {
    val result = !(1 until n).map(_ + (m / n) * n).toSet.contains(m)
    assert(result == (m % n == 0)) // yes, a less crazy implementation exists
    result
  }

  def divide(m: Int, n: Int, method: Int): Int = {
    method match {
      case 0 =>
        (1 to m).count(is_multiple_of(_, n))
      case 1 =>
        (1 to m)
          .map { x =>
            Future { is_multiple_of(x, n) }
          }
          .count(Await.result(_, Duration.Inf))
      case 2 =>
        Await.result(divide_futuristically(m, n), Duration.Inf)
    }
  }

  def divide_futuristically(m: Int, n: Int): Future[Int] = {
    val futures = (1 to m).map { x =>
      Future { is_multiple_of(x, n) }
    }
    Future.foldLeft(futures)(0) { (count, flag) =>
      { if (flag) { count + 1 } else { count } }
    }
    /* much worse performing alternative:
    Future.sequence(futures).map(_.count(identity))
    */
  }
}

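When I run this, the parallel case 1 is somewhat faster than the ordinary case 0 code (hurray), but case 2 takes twice as long. Of course, this depends on the system and on whether each future has enough work to do (which here grows with the denominator N). [PS] As expected, decreasing N puts case 0 in the lead, and increasing N enough makes both case 1 and case 2 about twice as fast as case 0 on my two-core CPU.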

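I was led to believe that divide_futuristically is the better way to express this kind of computation: return a Future holding the combined result. Blocking is only needed here so that we can measure performance. But in practice, the more blocking there is, the sooner everything finishes. What am I doing wrong? Several alternative ways of summing up the futures (such as sequence) all pay the same penalty.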

Answer 1

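It seems you are doing everything right. I tried different approaches myself, even .par, but got the same or worse results.

I dug into Future.foldLeft and tried to analyze what causes the delay: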

  /** A non-blocking, asynchronous left fold over the specified futures,
   *  with the start value of the given zero.
   *  The fold is performed asynchronously in left-to-right order as the futures become completed.
   *  The result will be the first failure of any of the futures, or any failure in the actual fold,
   *  or the result of the fold.
   *
   *  Example:
   *  {{{
   *    val futureSum = Future.foldLeft(futures)(0)(_ + _)
   *  }}}
   *
   * @tparam T       the type of the value of the input Futures
   * @tparam R       the type of the value of the returned `Future`
   * @param futures  the `scala.collection.immutable.Iterable` of Futures to be folded
   * @param zero     the start value of the fold
   * @param op       the fold operation to be applied to the zero and futures
   * @return         the `Future` holding the result of the fold
   */
  def foldLeft[T, R](futures: scala.collection.immutable.Iterable[Future[T]])(zero: R)(op: (R, T) => R)(implicit executor: ExecutionContext): Future[R] =
    foldNext(futures.iterator, zero, op)

  private[this] def foldNext[T, R](i: Iterator[Future[T]], prevValue: R, op: (R, T) => R)(implicit executor: ExecutionContext): Future[R] =
    if (!i.hasNext) successful(prevValue)
    else i.next().flatMap { value => foldNext(i, op(prevValue, value), op) }

This part:

else i.next().flatMap { value => foldNext(i, op(prevValue, value), op) }

.flatMap produces a new Future, and its callback is submitted to the executor. In other words, every

    { (count, flag) =>
      { if (flag) { count + 1 } else { count } }
    }

is executed as a new Future.
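One way to check this claim is to count how many tasks the fold hands to its executor. The sketch below is only an illustration: CountingContext and CountSubmissions are hypothetical names, not part of the standard library, and the input futures are created with Future.successful so that creating them submits nothing. I would expect roughly one submission per folded element, but that expectation is an assumption about the library internals and may differ between Scala versions.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future}

object CountSubmissions {
  // Hypothetical helper: counts every task passed to execute,
  // then delegates to the wrapped ExecutionContext.
  final class CountingContext(underlying: ExecutionContext) extends ExecutionContext {
    val submitted = new AtomicInteger(0)
    def execute(runnable: Runnable): Unit = {
      submitted.incrementAndGet()
      underlying.execute(runnable)
    }
    def reportFailure(cause: Throwable): Unit = underlying.reportFailure(cause)
  }

  // Folds n already-completed futures and returns (fold result, tasks submitted).
  def run(n: Int): (Int, Int) = {
    val counting = new CountingContext(ExecutionContext.global)
    // Already completed, so building this Vector does not touch the executor.
    val futures = Vector.fill(n)(Future.successful(true))
    val folded = Future.foldLeft(futures)(0) { (count, flag) =>
      if (flag) count + 1 else count
    }(counting)
    val total = Await.result(folded, Duration.Inf)
    (total, counting.submitted.get)
  }

  def main(args: Array[String]): Unit = {
    val (total, submitted) = run(1000)
    println(s"fold result: $total, tasks submitted by the fold: $submitted")
  }
}
```

If the submission count grows in step with the number of elements, the fold adds executor overhead per element on top of the futures that do the actual work.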

I guess this part causes the delay observed in the experiment.
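If the per-element scheduling overhead really is the cause, one possible workaround (not something I measured against the question's benchmark) is to create one Future per chunk of inputs rather than one per input, so that only a handful of futures need to be combined. In the sketch below, ChunkedDivide, divideChunked and the chunkSize parameter are hypothetical names introduced for illustration, and isMultipleOf is a plain stand-in for the question's deliberately slow is_multiple_of.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object ChunkedDivide {
  // Stand-in for the question's is_multiple_of (same result, less work).
  def isMultipleOf(m: Int, n: Int): Boolean = m % n == 0

  // One Future per chunk instead of one per element.
  // chunkSize is a hypothetical tuning knob; 10000 is an arbitrary default.
  def divideChunked(m: Int, n: Int, chunkSize: Int = 10000): Future[Int] = {
    val partialCounts = (1 to m).grouped(chunkSize).toVector.map { chunk =>
      Future { chunk.count(isMultipleOf(_, n)) }
    }
    // Only m / chunkSize futures are combined here, so the cost of
    // combining them is small compared with the work inside each chunk.
    Future.sequence(partialCounts).map(_.sum)
  }

  def main(args: Array[String]): Unit = {
    val m = 500 * 1000
    val n = 5
    val result = Await.result(divideChunked(m, n), Duration.Inf)
    assert(result == m / n)
    println(s"chunked result: $result")
  }
}
```

The method still returns a Future with the combined result, as divide_futuristically does, and blocking happens only in main. Whether this beats case 1 on a given machine depends on the chunk size and on how much work each element needs.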

How to efficiently combine future results as a Future
