I'm trying to reverse a string containing Unicode characters in Scala, and I want to find the fastest way to do it. So far I have this code:
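import scala.runtime.RichChar

// Despite the name, this fills a new array rather than reversing in place.
def reverseInPlace(s: String): String = {
  val reverse = new Array[RichChar](s.length)
  // Stop at the midpoint; `until` also keeps the empty string from indexing s(-1).
  for (i <- 0 until (s.length + 1) / 2) {
    reverse(i) = s(s.length - i - 1)
    reverse(s.length - i - 1) = s(i)
  }
  reverse.mkString
}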
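// foldLeft: prepend each Char to the accumulated String.
def reverseLeft(s: String): String = s.foldLeft("")((acc, c) => c + acc)

// foldRight: append each Char, visiting the string from its last Char.
def reverseRight(s: String): String = s.foldRight("")((c, acc) => acc + c)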
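def time[R](iterations: Int, block: => R): Unit = {
  val t0 = System.nanoTime()
  // `until` runs the block exactly `iterations` times (`to` ran it once more).
  for (_ <- 0 until iterations) {
    block // call-by-name: re-evaluated on every iteration
  }
  val t1 = System.nanoTime()
  println("Elapsed time: " + (t1 - t0) + "ns")
}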
time(1000, {
  reverseRight("Hello\u0041")
})
time(1000, {
  reverseInPlace("Hello\u0041")
})
time(1000, {
  reverseLeft("Hello\u0041")
})
time(1000, {
  "Hello\u0041".reverse
})
On my 2013 MacBook I got these results:
Elapsed time: 37013000ns
Elapsed time: 23592000ns
Elapsed time: 11647000ns
Elapsed time: 5579000ns
But I suspect these numbers are bogus. How can I benchmark this properly using Scala, sbt, and the JMH library?
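Note: as pointed out in the comments, microbenchmarking in Java is far from trivial. See (How do I write a correct micro-benchmark in Java?) and https://groups.google.com/d/msg/mechanical-sympathy/m4opvy4xq3U/7lY8x8SvHgwJ for why you shouldn't attempt microbenchmarking without an external library.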
Answer 0 (score: 3)
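This is a solution that uses none of those frameworks, but I wrote Thyme because I wanted a microbenchmarking framework that feels micro, not like an elephant.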
scala -cp /jvm/Thyme.jar
is all you need to do to run it in the REPL.
Now we need an implementation that actually works. I'll write two.
First attempt:
def revStr(s: String): String = {
  // Take the code point at every index that is not a low surrogate.
  val points = for (i <- s.indices if !s(i).isLowSurrogate) yield s.codePointAt(i)
  new String(points.toArray.reverse, 0, points.length)
}
Not so hard. But it may be slow; there's probably a lot of boxing in there. Let's try a faster version:
def reverseString(s: String): String = if (s.length < 2) s else {
  import java.lang.Character.{isLowSurrogate => lo, isHighSurrogate => hi}
  val chars = s.toCharArray
  var i = 0
  var j = s.length - 1
  var swapped = false
  // Swap chars from both ends inward; whenever a surrogate pair lands in
  // low-high order, flip it back to high-low.
  while (i < j) {
    swapped = false
    val a = chars(i)
    val b = chars(j)
    // `a` is the low half of a pair whose high half was just written to j+1.
    if (lo(a) && j+1 < s.length && hi(chars(j+1))) {
      chars(j) = chars(j+1)
      chars(j+1) = a
      swapped = true
    }
    else chars(j) = a
    // `b` is the high half of a pair whose low half was just written to i-1.
    if (hi(b) && i > 0 && lo(chars(i-1))) {
      chars(i) = chars(i-1)
      chars(i-1) = b
      swapped = true
    }
    else chars(i) = b
    i += 1
    j -= 1
  }
  if (i==j) {
    // Odd length: the middle char may complete a pair with a neighbour.
    val c = chars(i)
    if (lo(c) && j+1 < s.length && hi(chars(j+1))) {
      chars(j) = chars(j+1)
      chars(j+1) = c
    }
    else if (hi(c) && i > 0 && lo(chars(i-1))) {
      chars(i) = chars(i-1)
      chars(i-1) = c
    }
  }
  else if (!swapped && hi(chars(i)) && lo(chars(j))) {
    // Even length: a pair straddling the midpoint ended up low-high; flip it.
    val temp = chars(i)
    chars(i) = chars(j)
    chars(j) = temp
  }
  new String(chars)
}
Ouch. This was written for speed, not ease of use, but still, ouch.
Anyway, let's test it. I'm not doing a thorough warm-up here, but we'll get an idea:
scala> val th = new ichi.bench.Thyme
th: ichi.bench.Thyme = ichi.bench.Thyme@174580e6
scala> val testString = "This is a \ud800\udc00 test!"
testString: String = This is a test!
scala> val wrong = th.pbench{ testString.reverse }
Benchmark (327660 calls in 115.2 ms)
Time: 164.8 ns 95% CI 157.4 ns - 172.3 ns (n=19)
Garbage: 97.66 ns (n=2 sweeps measured)
wrong: String = !tset ?? a si sihT
scala> val slow = th.pbench{ revStr(testString) }
Benchmark (163820 calls in 467.2 ms)
Time: 749.0 ns 95% CI 742.5 ns - 755.5 ns (n=18)
Garbage: 2.112 us (n=2 sweeps measured)
slow: String = !tset a si sihT
scala> val fast = th.pbench{ reverseString(testString) }
Benchmark (327660 calls in 36.32 ms)
Time: 58.19 ns 95% CI 58.10 ns - 58.27 ns (n=18)
Garbage: 12.21 ns (n=1 sweeps measured)
fast: String = !tset a si sihT
scala> val compare = th.pbenchOff(){revStr(testString)}{reverseString(testString)}
Benchmark comparison (in 430.7 ms)
Significantly different (p ~= 0)
Time ratio: 0.09495 95% CI 0.08061 - 0.10928 (n=20)
First 777.9 ns 95% CI 756.0 ns - 799.8 ns
Second 73.86 ns 95% CI 62.90 ns - 84.81 ns
Individual benchmarks not fully consistent with head-to-head (p ~= 0)
First 742.9 ns 95% CI 742.0 ns - 743.9 ns
Second 58.33 ns 95% CI 58.19 ns - 58.46 ns
compare: String = !tset a si sihT
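The `??` in the `wrong` result is what a char-by-char reversal does to the surrogate pair `\ud800\udc00`: the two halves come out in low-high order, which is not a valid character. A minimal sketch of that effect follows. `naiveReverse` and `pairAwareReverse` are hypothetical helper names, not part of the benchmark above; the second relies on `java.lang.StringBuilder.reverse`, which the JDK documents as treating surrogate pairs as single characters.

```scala
// Hypothetical helpers for illustration only; not part of the benchmarked code.

// Reverses UTF-16 code units one by one, so a surrogate pair ends up low-high.
def naiveReverse(s: String): String = new String(s.toCharArray.reverse)

// Delegates to the JDK, which keeps each surrogate pair in high-low order.
def pairAwareReverse(s: String): String =
  new java.lang.StringBuilder(s).reverse().toString

val sample = "This is a \ud800\udc00 test!"
val broken = naiveReverse(sample)
val intact = pairAwareReverse(sample)

// In `broken`, the low surrogate now precedes the high surrogate.
println(Character.isLowSurrogate(broken.charAt(6)))
// In `intact`, the pair survives as a single supplementary character.
println(intact.codePointAt(6) == 0x10000)
```

Comparing the output of a hand-written reversal against `pairAwareReverse` on a few inputs is a cheap sanity check before timing anything.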
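So, bottom line: if you're going to microbenchmark, at least use a minimal microbenchmarking tool.
Also, code points are annoying, and direct array manipulation is fast.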
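As a possible middle ground between the two implementations above, the reversal can also walk the string backwards one code point at a time, which avoids the intermediate boxed collection without hand-managing surrogate swaps. This is a sketch under that assumption, not one of the benchmarked versions: `revByCodePoint` is a hypothetical name, and its speed relative to `reverseString` has not been measured here.

```scala
// Hypothetical alternative for illustration; not measured in the benchmarks above.
def revByCodePoint(s: String): String = {
  val sb = new java.lang.StringBuilder(s.length)
  var i = s.length
  while (i > 0) {
    // Reads the code point ending just before index i (one or two chars wide).
    val cp = s.codePointBefore(i)
    sb.appendCodePoint(cp)
    // Step back by the number of chars that code point occupied.
    i -= Character.charCount(cp)
  }
  sb.toString
}

val reversed = revByCodePoint("This is a \ud800\udc00 test!")
println(reversed.length)
```

Unpaired surrogates are passed through as single chars, since `codePointBefore` returns the lone surrogate in that case.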