How do Scala partial functions work?

Asked: 2019-04-29 17:24:21

Tags: scala functional-programming

// First using normal dictionary lookup
def findElement(e: String, dict: Map[String, Any]): Option[Any] = dict.get(e)

// Second using Partial function
def findElement(e: String, dict: Map[String, Any]): Option[Any] = dict.find { case (k, v) => k == e } map (_._2)


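Both give the same answer, but how does the second function work?

What is the Big-O of the partial function that uses the case keyword? Does it iterate over all the elements of the map to find the right key?

3 answers:

Answer 0: (score: 4)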

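A few things you need to know:

  • Map[A, B] is also a PartialFunction[A, B].
  • A partial function has a lift method that turns it into A => Option[B]; get is essentially the lifted apply (dict.lift).
  • A Map can also be viewed as a collection of pairs (conceptually Seq[(A, B)]); that is what you operate on in map, flatMap, collect, find, etc.
  • find is a function that returns the first element of a collection (for a Map, a pair) that satisfies the predicate; if the collection has no such element, None is returned.
  • { case (k, v) => } uses pattern matching to extract the values from the tuple and bind them to k and v.
  • _._2 is a tuple accessor (it returns the second value).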
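The points above can be checked with a small sketch. The object name MapAsPartialFunction and the sample keys are made up for illustration; only standard library calls (isDefinedAt, lift, get, toSeq) are used.

```scala
object MapAsPartialFunction {
  val dict: Map[String, Any] = Map("a" -> 1, "b" -> "two")

  // A Map can be used wherever a PartialFunction is expected.
  val pf: PartialFunction[String, Any] = dict

  // lift turns the partial function into a total String => Option[Any].
  val lifted: String => Option[Any] = pf.lift

  // The same Map viewed as a sequence of (key, value) pairs.
  val pairs: Seq[(String, Any)] = dict.toSeq

  def main(args: Array[String]): Unit = {
    println(pf.isDefinedAt("a"))              // true: "a" is a key
    println(pf.isDefinedAt("zzz"))            // false: no such key
    println(lifted("b") == dict.get("b"))     // true
    println(lifted("zzz") == dict.get("zzz")) // true (both are None)
    println(pairs.map(_._2))                  // the values, taken via the tuple accessor
  }
}
```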

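With that in mind:

dict.get(e)

... is straightforward: it returns the value for key e wrapped in Some if the key exists, and None otherwise (apply would instead throw on a missing key).

dict.find { case (k, v) => k == e } map (_._2)

This tries to find the first element for which k == e, yielding an Option[(String, Any)], and then uses map to transform the value inside the Option (if present) by reducing the whole tuple to just its second value.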
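To make the two steps visible, here is a minimal sketch that splits the one-liner into an intermediate Option[(String, Any)] and the final map. The object name FindStepByStep and the sample data are illustrative assumptions, not part of the question.

```scala
import scala.util.Try

object FindStepByStep {
  val dict: Map[String, Any] = Map("a" -> 1, "b" -> "two")

  def findElement(e: String, dict: Map[String, Any]): Option[Any] = {
    // Step 1: scan the (key, value) pairs for the first one whose key equals e.
    val pair: Option[(String, Any)] = dict.find { case (k, _) => k == e }
    // Step 2: keep only the value, if a pair was found.
    pair.map(_._2)
  }

  def main(args: Array[String]): Unit = {
    println(findElement("b", dict))                  // Some(two)
    println(findElement("zzz", dict))                // None
    println(findElement("b", dict) == dict.get("b")) // true
    // apply throws NoSuchElementException for a missing key, unlike get.
    println(Try(dict("zzz")).isFailure)              // true
  }
}
```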

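Answer 1: (score: 4)

dict.get(e) is O(1), or at worst O(N), depending on how the Map handles collisions and how the hash codes are distributed. dict.find { case (k, v) => k == e } map (_._2) is O(N), because of how .find is implemented. If we dig into the implementation of find, we see:

override /*TraversableLike*/ def find(p: A => Boolean): Option[A] =
    iterator.find(p)

Digging a little deeper:

def find(p: A => Boolean): Option[A] = {
  while (hasNext) {
    val a = next()
    if (p(a)) return Some(a)
  }
  None
}

Here we can see a while loop that iterates over the elements of the Iterator until the passed function p: A => Boolean returns true. So we have at most N iterations.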
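One way to observe the linear scan without a benchmark is to count how often the predicate runs. This is only a sketch: the object name FindIsLinear and the helper predicateCalls are hypothetical, and the count for a key that exists depends on the Map's iteration order, so only the missing-key case has a fixed result.

```scala
object FindIsLinear {
  // Counts how many times find evaluates its predicate while looking for key e.
  def predicateCalls(e: String, dict: Map[String, Any]): Int = {
    var calls = 0
    dict.find { case (k, _) => calls += 1; k == e }
    calls
  }

  val dict: Map[String, Any] =
    (0 to 100).map(i => s"key$i" -> s"value$i").toMap

  def main(args: Array[String]): Unit = {
    // A missing key forces find to visit every one of the 101 entries.
    println(predicateCalls("missing", dict)) // 101
    // An existing key stops the scan early; where exactly depends on iteration order.
    println(predicateCalls("key99", dict))
  }
}
```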

I wasn't too lazy to write a benchmark using sbt-jmh:

import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations.{Benchmark, OutputTimeUnit, Scope, State}

@State(Scope.Benchmark)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
class FindElementBenchmark {

  // 101 entries: "key0" -> "value0" ... "key100" -> "value100"
  val dict: Map[String, Any] =
    (0 to 100).foldLeft(Map.empty[String, Any])((m, i) => m + (s"key$i" -> s"value$i"))

  val e: String = "key99"

  // First using normal dictionary lookup
  @Benchmark
  def findElementDict: Option[Any] =
    dict.get(e)

  // Second using Partial function
  @Benchmark
  def findElementPF: Option[Any] =
    dict
      .find { case (k, v) => k == e }
      .map(_._2)
}

Run it:

$ sbt
sbt:benchmarks> jmh:run -i 20 -wi 10 -f1 -t1

And the results:

[info] Running (fork) org.openjdk.jmh.Main -i 20 -wi 10 -f1 -t1
[info] # JMH version: 1.21
[info] # VM version: JDK 1.8.0_161, Java HotSpot(TM) 64-Bit Server VM, 25.161-b12
[info] # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre/bin/java
[info] # VM options: <none>
[info] # Warmup: 10 iterations, 10 s each
[info] # Measurement: 20 iterations, 10 s each
[info] # Timeout: 10 min per iteration
[info] # Threads: 1 thread, will synchronize iterations
[info] # Benchmark mode: Throughput, ops/time
[info] # Benchmark: bmks.FindElementBenchmark.findElementDict
[info] # Run progress: 0.00% complete, ETA 00:10:00
[info] # Fork: 1 of 1
[info] # Warmup Iteration 1: 48223.037 ops/ms
[info] # Warmup Iteration 2: 48570.873 ops/ms
[info] # Warmup Iteration 3: 48730.899 ops/ms
[info] # Warmup Iteration 4: 45050.838 ops/ms
[info] # Warmup Iteration 5: 48191.539 ops/ms
[info] # Warmup Iteration 6: 48464.603 ops/ms
[info] # Warmup Iteration 7: 48690.140 ops/ms
[info] # Warmup Iteration 8: 46432.571 ops/ms
[info] # Warmup Iteration 9: 46772.835 ops/ms
[info] # Warmup Iteration 10: 47214.496 ops/ms
[info] Iteration 1: 49149.297 ops/ms
[info] Iteration 2: 48476.424 ops/ms
[info] Iteration 3: 48590.436 ops/ms
[info] Iteration 4: 48214.015 ops/ms
[info] Iteration 5: 48698.636 ops/ms
[info] Iteration 6: 48686.357 ops/ms
[info] Iteration 7: 48948.054 ops/ms
[info] Iteration 8: 48917.577 ops/ms
[info] Iteration 9: 48872.980 ops/ms
[info] Iteration 10: 48970.421 ops/ms
[info] Iteration 11: 46269.031 ops/ms
[info] Iteration 12: 44934.335 ops/ms
[info] Iteration 13: 46279.314 ops/ms
[info] Iteration 14: 47721.223 ops/ms
[info] Iteration 15: 46238.490 ops/ms
[info] Iteration 16: 47453.282 ops/ms
[info] Iteration 17: 47886.762 ops/ms
[info] Iteration 18: 48032.580 ops/ms
[info] Iteration 19: 48142.064 ops/ms
[info] Iteration 20: 48460.665 ops/ms
[info] Result "bmks.FindElementBenchmark.findElementDict":
[info] 47947.097 ±(99.9%) 1003.440 ops/ms [Average]
[info] (min, avg, max) = (44934.335, 47947.097, 49149.297), stdev = 1155.563
[info] CI (99.9%): [46943.657, 48950.537] (assumes normal distribution)
[info] # JMH version: 1.21
[info] # VM version: JDK 1.8.0_161, Java HotSpot(TM) 64-Bit Server VM, 25.161-b12
[info] # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre/bin/java
[info] # VM options: <none>
[info] # Warmup: 10 iterations, 10 s each
[info] # Measurement: 20 iterations, 10 s each
[info] # Timeout: 10 min per iteration
[info] # Threads: 1 thread, will synchronize iterations
[info] # Benchmark mode: Throughput, ops/time
[info] # Benchmark: bmks.FindElementBenchmark.findElementPF
[info] # Run progress: 50.00% complete, ETA 00:05:00
[info] # Fork: 1 of 1
[info] # Warmup Iteration 1: 7261.136 ops/ms
[info] # Warmup Iteration 2: 7548.525 ops/ms
[info] # Warmup Iteration 3: 7517.692 ops/ms
[info] # Warmup Iteration 4: 7126.543 ops/ms
[info] # Warmup Iteration 5: 7732.285 ops/ms
[info] # Warmup Iteration 6: 7525.456 ops/ms
[info] # Warmup Iteration 7: 7739.055 ops/ms
[info] # Warmup Iteration 8: 7555.671 ops/ms
[info] # Warmup Iteration 9: 7624.464 ops/ms
[info] # Warmup Iteration 10: 7527.114 ops/ms
[info] Iteration 1: 7631.426 ops/ms
[info] Iteration 2: 7607.643 ops/ms
[info] Iteration 3: 7636.029 ops/ms
[info] Iteration 4: 7413.881 ops/ms
[info] Iteration 5: 7726.417 ops/ms
[info] Iteration 6: 7410.291 ops/ms
[info] Iteration 7: 7452.339 ops/ms
[info] Iteration 8: 7825.050 ops/ms
[info] Iteration 9: 7801.677 ops/ms
[info] Iteration 10: 7783.978 ops/ms
[info] Iteration 11: 7788.909 ops/ms
[info] Iteration 12: 7778.982 ops/ms
[info] Iteration 13: 7784.158 ops/ms
[info] Iteration 14: 7771.173 ops/ms
[info] Iteration 15: 7750.280 ops/ms
[info] Iteration 16: 7813.570 ops/ms
[info] Iteration 17: 7845.550 ops/ms
[info] Iteration 18: 7841.003 ops/ms
[info] Iteration 19: 7808.576 ops/ms
[info] Iteration 20: 7847.100 ops/ms
[info] Result "bmks.FindElementBenchmark.findElementPF":
[info] 7715.902 ±(99.9%) 124.303 ops/ms [Average]
[info] (min, avg, max) = (7410.291, 7715.902, 7847.100), stdev = 143.148
[info] CI (99.9%): [7591.598, 7840.205] (assumes normal distribution)
[info] # Run complete. Total time: 00:10:01
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                              Mode  Cnt      Score      Error   Units
[info] FindElementBenchmark.findElementDict  thrpt   20  47947.097 ± 1003.440  ops/ms
[info] FindElementBenchmark.findElementPF    thrpt   20   7715.902 ±  124.303  ops/ms
[success] Total time: 603 s, completed Apr 30, 2019 7:33:10 PM

As we can see, findElementDict scores roughly 6 times higher than findElementPF (47947 vs. 7716 ops/ms). The benchmark thus confirms in practice the complexity assessment made in theory above.

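Answer 2: (score: 3)

Edit: The OP has edited their question to remove the reference to "partially applied function", so those parts are no longer relevant. I am leaving them here because they may be valuable to others who mix the two up.

That is not a partially applied function, it is a PartialFunction (https://www.scala-lang.org/api/current/scala/PartialFunction.html).

A partially applied function is a function with multiple parameters for which you have supplied only some of the arguments; you can hand that partially applied function to someone else to supply the rest.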
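For contrast, a partially applied function might look like the sketch below. It reuses the findElement signature from the question; the object name PartiallyAppliedDemo, the value lookup, and the sample data are assumptions made for illustration.

```scala
object PartiallyAppliedDemo {
  def findElement(e: String, dict: Map[String, Any]): Option[Any] = dict.get(e)

  val dict: Map[String, Any] = Map("a" -> 1, "b" -> "two")

  // Partially applied: the dictionary is fixed now, the key is supplied later.
  val lookup: String => Option[Any] = findElement(_, dict)

  def main(args: Array[String]): Unit = {
    println(lookup("a"))   // Some(1)
    println(lookup("zzz")) // None
  }
}
```

Note that lookup is an ordinary total function of type String => Option[Any]; nothing about it is partial in the PartialFunction sense.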

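The documentation defines a partial function as:

A partial function of type PartialFunction[A, B] is a unary function where the domain does not necessarily include all values of type A. The function isDefinedAt allows to test dynamically if a value is in the domain of the function.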

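The case you provided happens to cover every input, because your domain is all Tuple2 values and you do not guard on specific values, but nothing forces you to cover all cases in a PartialFunction.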
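As a sketch of a PartialFunction that does not cover its whole domain, the example below adds a guard so that only some pairs match. The names PartialCases and keyed, the "key" prefix, and the sample pairs are hypothetical.

```scala
object PartialCases {
  // Defined only for pairs whose key starts with "key"; the guard narrows the domain.
  val keyed: PartialFunction[(String, Any), Any] = {
    case (k, v) if k.startsWith("key") => v
  }

  val pairs: List[(String, Any)] = List("key1" -> 1, "other" -> 2)

  def main(args: Array[String]): Unit = {
    println(keyed.isDefinedAt("key1" -> 1))  // true
    println(keyed.isDefinedAt("other" -> 2)) // false
    // collect applies the function only where it is defined.
    println(pairs.collect(keyed))            // List(1)
    // Applying it outside its domain throws a MatchError; lift avoids that.
    println(keyed.lift("other" -> 2))        // None
  }
}
```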