Question

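I am trying to understand Futures, so I wrote a Summer class that splits a list, sums the parts in n different futures, and combines the results. It is about 5 times slower than the non-split version, and I would like to know why. Here is my baseline: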

import java.util.Date

object SummerMain {
    def main(args: Array[String]) = {
        val xs = List.fill(10000000)(1)

        println("Starting")

        val t = Timer()
        val x = xs.foldLeft(0)(_+_)
        val time = t.stop

        println(s"Sum: ${x}, time: ${time} ms")
    }
}

case class Timer(startTime: Long = new Date().getTime) {
    private def curMs: Long = new Date().getTime

    def restart: Timer = Timer(curMs)
    def stop: Long = curMs - startTime
    def lap: (Long, Timer) = {
        val curTime = curMs
        (curTime - startTime, Timer(curTime))
    }

}

This runs in about 790 ms on average.

But this takes about 4.5 seconds:

import scala.concurrent._
import duration._
import ExecutionContext.Implicits.global

object SummerMain {
    def main(args: Array[String]) = {
        val s = Summer(
                    xs = List.fill(10000000)(1),
                    nParts = 5 // The number of futures to divide it over
                )

        println("Starting")

        val t = Timer()
        val x = s.breakSum
        val time = t.stop

        println(s"Sum: ${x}, time: ${time} ms")
    }
}

case class Summer(xs: List[Int], nParts: Int) {
    lazy val elemsPer = (xs.length / nParts) + 1

    def sum(xs: List[Int]): Long =
        xs.foldLeft(0L)(_ + _) // accumulate as Long so the running total cannot overflow Int

    def break(ys: List[Int]): List[List[Int]] = ys match {
        case Nil    => List()
        case zs     => (zs take elemsPer) :: break(zs drop elemsPer)
    }

    def breakSum: Long = {
        val futures: List[Future[Long]] = break(xs) map { ys =>
                Future( sum(ys) )
        }

        var s: Long = 0L

        for ( f <- futures ) {
            s += Await.result(f, 10.hours)
        }

        s
    }
}

Is my algorithm so inefficient that it cancels out the gains, or am I using Future incorrectly?

Answer 1

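The + operation you are trying to parallelize is extremely fast.

Basically, the baseline only takes the time needed to traverse all the elements, since + on integers costs about 1 CPU cycle. That is hard to beat.

The problem is that splitting the original list takes more time than just summing all the elements: you need to allocate new memory, and you need to traverse every element of the list (to put it into a new list), which is already enough work to have produced the result in the first place!

Once the list has been split, submitting the Runnables triggers the creation of 5 threads, which is not a free operation and adds more overhead. Only after all of that can the parallel version start to gain anything over the non-concurrent one.

Parallelization pays off when the operations being parallelized are expensive. For fast operations, it is better to let one CPU do the whole job and leave the other CPUs free for other work.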
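As a contrast, here is a minimal sketch of the kind of workload where futures do tend to help: a handful of tasks that are each deliberately heavy. `ExpensiveWork`, `countPrimes` and `countAll` are names made up for this illustration, and naive trial division is just a stand-in for any CPU-bound job.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ExpensiveWork {
  // Hypothetical CPU-heavy task: count the primes below n by naive trial division.
  def countPrimes(n: Int): Int =
    (2 until n).count(k => (2 to math.sqrt(k.toDouble).toInt).forall(k % _ != 0))

  // Starts one Future per input; each task does enough work that the
  // scheduling overhead is small in comparison.
  def countAll(limits: List[Int]): List[Int] =
    Await.result(Future.traverse(limits)(n => Future(countPrimes(n))), 1.minute)
}
```

With inputs such as `List(2000000, 2000000, 2000000, 2000000)` each task should run long enough for the extra cores to matter, although the actual speedup depends on the machine.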

Answer 2

Problem

The break method is extremely inefficient.

case zs     => (zs take elemsPer) :: break(zs drop elemsPer)

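This code walks every chunk twice: take copies the elements into a new list, and drop traverses them again to reach the rest. That alone takes more time than simply summing the items.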
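One way to check this is to time the splitting separately from the summing. The sketch below is only a rough measurement aid: `SplitCost`, `timed` and `split` are names invented here, `split` mirrors the question's break, and a single un-warmed run on the JVM gives noisy numbers.

```scala
object SplitCost {
  // Runs body once and returns its result with the elapsed time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  // Same take/drop splitting as the question's break method.
  def split(xs: List[Int], elemsPer: Int): List[List[Int]] = xs match {
    case Nil => Nil
    case zs  => zs.take(elemsPer) :: split(zs.drop(elemsPer), elemsPer)
  }

  def main(args: Array[String]): Unit = {
    val xs = List.fill(1000000)(1)
    val (parts, splitMs) = timed(split(xs, xs.length / 5 + 1))
    val (total, sumMs) = timed(xs.foldLeft(0L)(_ + _))
    println(s"split: $splitMs ms (${parts.length} parts), sum: $sumMs ms (total $total)")
  }
}
```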

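Possible solution

Store the numbers in an indexed structure (for example an Array or an IndexedSeq) and pass a start and an end index to each thread. Each thread should compute the sum between its two indices, but over the same shared collection.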
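A minimal sketch of that idea, assuming the data fits in an `Array[Int]`. `IndexedSummer`, `sumRange` and `parallelSum` are hypothetical names, and the one-minute timeout is arbitrary. Only the index space is divided, so nothing is copied.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object IndexedSummer {
  // Sums xs(from) up to, but not including, xs(until) with a plain loop:
  // no intermediate collections are allocated.
  def sumRange(xs: Array[Int], from: Int, until: Int): Long = {
    var i = from
    var acc = 0L
    while (i < until) {
      acc += xs(i)
      i += 1
    }
    acc
  }

  // Divides the index space (not the data) into at most nParts ranges
  // and sums each range in its own Future over the shared array.
  def parallelSum(xs: Array[Int], nParts: Int): Long = {
    require(nParts > 0, "nParts must be positive")
    val chunk = math.max(1, (xs.length + nParts - 1) / nParts) // ceiling division
    val futures: List[Future[Long]] =
      (0 until xs.length by chunk).toList.map { start =>
        val end = math.min(start + chunk, xs.length)
        Future(sumRange(xs, start, end))
      }
    Await.result(Future.sequence(futures).map(_.sum), 1.minute)
  }
}
```

Calling `IndexedSummer.parallelSum(Array.fill(10000000)(1), 5)` would correspond to the benchmark in the question. Whether it beats the single-threaded fold still depends on the machine, since the per-element work remains tiny.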

Room for improvement

    for ( f <- futures ) {
        s += Await.result(f, 10.hours)
    }

The code above can be improved to get the most out of the parallelism.

Future.reduce combines the results as they arrive, which may give better results:

val sum = Future.reduce(futures)(_ + _)
Await.result(sum, 10.hours)
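On Scala 2.12 and later, Future.reduce is deprecated in favour of Future.reduceLeft. The following is a sketch under that assumption; `CombineFutures` and `combine` are names made up here, and the timeout is arbitrary. Note that reduceLeft fails with a NoSuchElementException when the list of futures is empty.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CombineFutures {
  // Sums each part in its own Future, then folds the partial sums together
  // without blocking until the single final Await.
  def combine(parts: List[List[Int]]): Long = {
    val futures: List[Future[Long]] =
      parts.map(ys => Future(ys.foldLeft(0L)(_ + _)))
    val total: Future[Long] = Future.reduceLeft[Long, Long](futures)(_ + _)
    Await.result(total, 1.minute)
  }
}
```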
