Calling a function inside or outside filter

Asked: 2016-11-12 19:49:10

Tags: performance scala optimization

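Let's say I have the following two functionally equivalent code snippets, which return the strings in the list whose reverse is also in the list: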

var a = Array("abc", "bca", "abc", "aba", "cba")
a.filter(x => a.toSet(x.reverse)).distinct

var a = Array("abc", "bca", "abc", "aba", "cba")
var aSet = a.toSet  // notice that toSet is called outside filter
a.filter(x => aSet(x.reverse)).distinct

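I'm wondering whether there is a difference in time complexity between these snippets, since in the first one I call a.toSet for every element of a, whereas in the second one I call it only once at the start. That said, something tells me the compiler might optimize the first call, so that the two snippets end up equivalent in time complexity.

If the latter is true, could you point me to some relevant literature?

Thanks.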

1 Answer:

Answer 0: (score: 5)

Well, let's test it (using ScalaMeter):

import org.scalameter.{Gen, PerformanceTest}

class SOPerformance extends PerformanceTest.Quickbenchmark {
  val gen = Gen.unit("unit")
  @inline def fn = {
    val a = Array("abc", "bca", "abc", "aba", "cba")
    a.filter(x => a.toSet(x.reverse)).distinct
  }
  @inline def fn2 = {
    val a = Array("abc", "bca", "abc", "aba", "cba")
    val aSet = a.toSet  // notice that toSet is called outside filter
    a.filter(x => aSet(x.reverse)).distinct
  }

  performance of "Range" in {
    measure method "fn" in {
      using(gen) in { gen ⇒
        fn
      }
    }
    measure method "fn2" in {
      using(gen) in { gen ⇒
        fn2
      }
    }
  }
}

This shows that fn runs in 0.005674 ms on average, and fn2 in 0.003903 ms on average.

Now let's make the array a bit bigger!

import org.scalameter.{Gen, PerformanceTest}

class SOPerformance extends PerformanceTest.Quickbenchmark {
  val a = (1 to 1000).map(_.toString).toArray

  val gen = Gen.unit("unit")

  @inline def fn = {
    a.filter(x => a.toSet(x.reverse)).distinct
  }
  @inline def fn2 = {
    val aSet = a.toSet  // notice that toSet is called outside filter
    a.filter(x => aSet(x.reverse)).distinct
  }

  performance of "Range" in {
    measure method "fn" in {
      using(gen) in { gen ⇒
        fn
      }
    }
    measure method "fn2" in {
      using(gen) in { gen ⇒
        fn2
      }
    }
  }
}

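This shows the real killer. fn takes 158.241861 ms on average, while fn2 takes 0.353472 ms! Why? The measurements indicate that a.toSet is not hoisted out of the lambda: fn builds a new set for every element (roughly O(n²) work overall), while fn2 builds it once (roughly O(n)). Creating sets is really expensive, especially since every new HashSet allocates memory that later has to be garbage collected.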
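To see the per-element rebuild directly, without relying on timings, one option is to count how often the set gets built. The sketch below is only an illustration: `ToSetCount`, `buildSet`, and `builds` are hypothetical names that do not appear in the question, and the counter simply wraps `a.toSet`.

```scala
object ToSetCount {
  // Hypothetical counter: number of times the set has been built.
  var builds = 0

  // Hypothetical helper: wraps a.toSet and records each call.
  def buildSet(a: Array[String]): Set[String] = {
    builds += 1
    a.toSet
  }

  // Set built inside the predicate, as in the first snippet.
  def inside(a: Array[String]): Array[String] =
    a.filter(x => buildSet(a)(x.reverse)).distinct

  // Set built once before filtering, as in the second snippet.
  def outside(a: Array[String]): Array[String] = {
    val aSet = buildSet(a)
    a.filter(x => aSet(x.reverse)).distinct
  }

  def main(args: Array[String]): Unit = {
    val a = Array("abc", "bca", "abc", "aba", "cba")

    builds = 0
    val r1 = inside(a)
    val insideBuilds = builds

    builds = 0
    val r2 = outside(a)
    val outsideBuilds = builds

    println(s"inside:  ${r1.mkString(", ")} ($insideBuilds set builds)")
    println(s"outside: ${r2.mkString(", ")} ($outsideBuilds set builds)")
  }
}
```

On the five-element array from the question, the first version builds the set five times (once per element) and the second builds it once, while both return the same result. With 1000 elements that becomes 1000 set builds versus one, which matches the gap seen in the benchmark.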