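Are multiple assignments (e.g. val (x, y) = (1, 2)) less efficient at runtime than the corresponding single assignments (val x = 1; val y = 2)?
I could imagine that the answer is yes, because Scala may have to construct intermediate tuples. Is that right?
What if I already have a tuple, e.g. val tup = (1, 2)? Which is now more efficient:
(a) val (x, y) = tup
or
(b) val x = tup._1; val y = tup._2
Or are they the same?
The difference from the previous example is that the right-hand side no longer needs to be allocated.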
Answer 0 (score: 10)
You can use the new :javap feature of the Scala 2.9 REPL:
scala> class A { val (a, b) = (1, 2) }
scala> :javap -c A
Compiled from "<console>"
public class A extends java.lang.Object implements scala.ScalaObject{
...
public A();
Code:
0: aload_0
1: invokespecial #22; //Method java/lang/Object."<init>":()V
4: aload_0
5: new #24; //class scala/Tuple2$mcII$sp
8: dup
9: iconst_1
10: iconst_2
11: invokespecial #27; //Method scala/Tuple2$mcII$sp."<init>":(II)V
14: astore_1
15: aload_1
16: ifnull 68
19: aload_1
20: astore_2
21: new #24; //class scala/Tuple2$mcII$sp
24: dup
25: aload_2
26: invokevirtual #33; //Method scala/Tuple2._1:()Ljava/lang/Object;
29: invokestatic #39; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
32: aload_2
33: invokevirtual #42; //Method scala/Tuple2._2:()Ljava/lang/Object;
36: invokestatic #39; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
39: invokespecial #27; //Method scala/Tuple2$mcII$sp."<init>":(II)V
42: putfield #44; //Field x$1:Lscala/Tuple2;
45: aload_0
46: aload_0
47: getfield #44; //Field x$1:Lscala/Tuple2;
50: invokevirtual #47; //Method scala/Tuple2._1$mcI$sp:()I
53: putfield #14; //Field a:I
56: aload_0
57: aload_0
58: getfield #44; //Field x$1:Lscala/Tuple2;
61: invokevirtual #50; //Method scala/Tuple2._2$mcI$sp:()I
64: putfield #16; //Field b:I
67: return
68: new #52; //class scala/MatchError
71: dup
72: aload_1
73: invokespecial #55; //Method scala/MatchError."<init>":(Ljava/lang/Object;)V
76: athrow
}
scala> class B { val a = 1; val b = 2 }
scala> :javap -c B
Compiled from "<console>"
public class B extends java.lang.Object implements scala.ScalaObject{
...
public B();
Code:
0: aload_0
1: invokespecial #20; //Method java/lang/Object."<init>":()V
4: aload_0
5: iconst_1
6: putfield #12; //Field a:I
9: aload_0
10: iconst_2
11: putfield #14; //Field b:I
14: return
}
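So I guess the answer is that the tuple version is slower. I wonder why there is boxing going on (the calls to BoxesRunTime.unboxToInt); shouldn't that have gone away with the specialization of tuples?!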
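The constructor bytecode above matches the expansion that the Scala Language Specification gives for a pattern definition with more than one variable: the right-hand side is matched against the pattern, the bound variables are repacked into a fresh tuple, and each val is then read from that tuple. A minimal sketch of that expansion, written out by hand (the names DesugarSketch and tmp are made up for illustration; the compiler uses a synthetic name such as x$1):

```scala
object DesugarSketch {
  // Hand-written equivalent of `val (a, b) = (1, 2)`.
  // `tmp` plays the role of the compiler's synthetic field (x$1 in the javap output).
  private val tmp: (Int, Int) = ((1, 2): (Int, Int)) match {
    case (a, b) => (a, b) // second tuple allocation, as seen in the bytecode
  }
  val a: Int = tmp._1
  val b: Int = tmp._2

  def main(args: Array[String]): Unit =
    println(s"$a $b")
}
```

This is only meant to explain where the two tuple allocations and the MatchError branch come from; whether the JIT later removes them is a separate question.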
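Answer 1 (score: 2)
I was not satisfied with the lack of benchmarks, so here are some, done with https://github.com/sirthias/scala-benchmarking-template, which uses Google Caliper in the background. The charts show the computed time per inner-loop execution (ns), while the text results come straight from the console. Code: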
package org.example

import annotation.tailrec
import com.google.caliper.Param

class Benchmark extends SimpleScalaBenchmark {

  @Param(Array("10", "100", "1000", "10000"))
  val length: Int = 0

  var array: Array[Int] = _

  override def setUp() {
    array = new Array(length)
  }

  def timeRegular(reps: Int) = repeat(reps) {
    var result = 0
    array.foreach { value => {
      val tuple = (value, value)
      val (out1, out2) = tuple
      result += out1
      result += out2
    }}
    result
  }

  def timeUnpack(reps: Int) = repeat(reps) {
    var result = 0
    array.foreach { value => {
      val tuple = (value, value)
      val out1 = tuple._1
      val out2 = tuple._2
      result += out1
      result += out2
    }}
    result
  }

  def timeBoxedUnpack(reps: Int) = repeat(reps) {
    var result = 0
    array.foreach { value => {
      val tuple = (value, value, value)
      val out1 = tuple._1
      val out2 = tuple._2
      val out3 = tuple._3
      result += out1
      result += out2
      result += out3
    }}
    result
  }
}
0% Scenario{vm=java, trial=0, benchmark=Regular, length=10} 102.09 ns; σ=1.04 ns @ 10 trials
8% Scenario{vm=java, trial=0, benchmark=Unpack, length=10} 28.23 ns; σ=0.27 ns @ 6 trials
17% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10} 110.17 ns; σ=1.95 ns @ 10 trials
25% Scenario{vm=java, trial=0, benchmark=Regular, length=100} 909.73 ns; σ=6.42 ns @ 3 trials
33% Scenario{vm=java, trial=0, benchmark=Unpack, length=100} 271.40 ns; σ=1.35 ns @ 3 trials
42% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=100} 946.59 ns; σ=8.38 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=Regular, length=1000} 8966.33 ns; σ=40.17 ns @ 3 trials
58% Scenario{vm=java, trial=0, benchmark=Unpack, length=1000} 2517.54 ns; σ=4.56 ns @ 3 trials
67% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=1000} 9374.71 ns; σ=68.25 ns @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=Regular, length=10000} 81244.84 ns; σ=661.81 ns @ 3 trials
83% Scenario{vm=java, trial=0, benchmark=Unpack, length=10000} 23502.73 ns; σ=122.83 ns @ 3 trials
92% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10000} 112683.27 ns; σ=1101.51 ns @ 4 trials
length   benchmark        ns  linear runtime
    10     Regular     102.1  =
    10      Unpack      28.2  =
    10 BoxedUnpack     110.2  =
   100     Regular     909.7  =
   100      Unpack     271.4  =
   100 BoxedUnpack     946.6  =
  1000     Regular    8966.3  ==
  1000      Unpack    2517.5  =
  1000 BoxedUnpack    9374.7  ==
 10000     Regular   81244.8  =====================
 10000      Unpack   23502.7  ======
 10000 BoxedUnpack  112683.3  ==============================
0% Scenario{vm=java, trial=0, benchmark=Regular, length=10} 28.26 ns; σ=0.13 ns @ 3 trials
8% Scenario{vm=java, trial=0, benchmark=Unpack, length=10} 28.27 ns; σ=0.07 ns @ 3 trials
17% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10} 109.56 ns; σ=2.27 ns @ 10 trials
25% Scenario{vm=java, trial=0, benchmark=Regular, length=100} 273.40 ns; σ=2.73 ns @ 5 trials
33% Scenario{vm=java, trial=0, benchmark=Unpack, length=100} 271.25 ns; σ=2.63 ns @ 6 trials
42% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=100} 1088.00 ns; σ=10.60 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=Regular, length=1000} 2516.30 ns; σ=7.13 ns @ 3 trials
58% Scenario{vm=java, trial=0, benchmark=Unpack, length=1000} 2525.00 ns; σ=24.25 ns @ 6 trials
67% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=1000} 10188.98 ns; σ=101.32 ns @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=Regular, length=10000} 25886.80 ns; σ=116.33 ns @ 3 trials
83% Scenario{vm=java, trial=0, benchmark=Unpack, length=10000} 25938.97 ns; σ=76.02 ns @ 3 trials
92% Scenario{vm=java, trial=0, benchmark=BoxedUnpack, length=10000} 115629.82 ns; σ=1159.41 ns @ 5 trials
length   benchmark        ns  linear runtime
    10     Regular      28.3  =
    10      Unpack      28.3  =
    10 BoxedUnpack     109.6  =
   100     Regular     273.4  =
   100      Unpack     271.2  =
   100 BoxedUnpack    1088.0  =
  1000     Regular    2516.3  =
  1000      Unpack    2525.0  =
  1000 BoxedUnpack   10189.0  ==
 10000     Regular   25886.8  ======
 10000      Unpack   25939.0  ======
 10000 BoxedUnpack  115629.8  ==============================
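Unpacking a tuple is fast as long as the tuple arity is <= 2. If it is greater than 2, too much indirection is involved, which the HotSpot compiler then has to optimize (BoxedUnpack, which uses a three-element tuple, takes roughly four times as long as Unpack in both runs).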
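One likely reason for the arity cutoff is that the standard library specializes Tuple2 for primitives such as Int, while Tuple3 stores its elements as plain objects, so Int values get boxed. The following sketch uses Java reflection to list the fields of the runtime classes; the object name TupleClasses and the helper fieldTypes are made up for this illustration, and the exact class chosen for the pair may depend on the compiler version.

```scala
object TupleClasses {
  // Hypothetical helper: "fieldName: fieldType" for every field declared
  // directly on the runtime class of `x`.
  def fieldTypes(x: AnyRef): List[String] =
    x.getClass.getDeclaredFields.toList
      .map(f => s"${f.getName}: ${f.getType.getName}")

  def main(args: Array[String]): Unit = {
    val pair = (1, 2)      // Scala 2 typically picks a specialized subclass here
    val triple = (1, 2, 3) // plain scala.Tuple3, elements stored as Object

    println(pair.getClass.getName + " -> " + fieldTypes(pair))
    println(triple.getClass.getName + " -> " + fieldTypes(triple))
  }
}
```

On the setup from this thread one would expect the pair to print as scala.Tuple2$mcII$sp with int fields (the class seen in the javap output of the first answer), and the triple as scala.Tuple3 with java.lang.Object fields.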
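Something odd is going on with Scala 2.9.2 that makes the two forms of tuple assignment differ: in the first run above, Regular (val (out1, out2) = tuple) takes roughly 3.5 times as long as Unpack (tuple._1, tuple._2), while in the second run the two are on par. Strange, but it can probably be ignored.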
This was measured with:
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
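Answer 2 (score: 0)
Just execute each option a million times and measure the time needed by calling System.currentTimeMillis. In theory the multiple assignment should be less efficient, but it might get optimized away.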
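A minimal sketch of that approach, assuming a hand-rolled timing loop (NaiveTiming, timeMillis, viaPattern and viaFields are names invented here). The checksum is returned so the loop body cannot be discarded as dead code, and one untimed warm-up pass is run first; even so, timings from a loop like this are only a rough indication compared with a harness such as Caliper.

```scala
object NaiveTiming {
  // Runs `body` for i = 0 until reps; returns (elapsed milliseconds, checksum).
  def timeMillis(reps: Int)(body: Int => Int): (Long, Long) = {
    val start = System.currentTimeMillis()
    var sink = 0L
    var i = 0
    while (i < reps) { sink += body(i); i += 1 }
    (System.currentTimeMillis() - start, sink)
  }

  // Multiple assignment through a tuple pattern.
  def viaPattern(i: Int): Int = { val (x, y) = (i, i + 1); x + y }

  // Two separate single assignments.
  def viaFields(i: Int): Int = { val x = i; val y = i + 1; x + y }

  def main(args: Array[String]): Unit = {
    val reps = 1000000
    // Warm-up pass so the JIT has a chance to compile both variants.
    timeMillis(reps)(viaPattern)
    timeMillis(reps)(viaFields)

    val patternResult = timeMillis(reps)(viaPattern)
    val fieldsResult = timeMillis(reps)(viaFields)
    println("pattern: " + patternResult._1 + " ms, checksum " + patternResult._2)
    println("fields:  " + fieldsResult._1 + " ms, checksum " + fieldsResult._2)
  }
}
```

With a million iterations, both variants may well finish within a few milliseconds, which is close to the resolution of System.currentTimeMillis; increasing reps or repeating the measurement several times makes the comparison more meaningful.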