I need to OR two very large (> 1 GB) byte arrays in Spark (so in Scala).
I am looking for the most efficient way to do it, in both speed and memory, which means I want to avoid things like `zip` or anything else that converts my arrays to lists.
For now I use the following method, but I would like to know whether some of you have other ideas...
def bitor(x: Array[Byte], y: Array[Byte]): Array[Byte] = {
  // note: `0 to x.size` would run one index past the end; use `until`
  for (i <- 0 until x.length) {
    x(i) = (x(i) | y(i)).toByte
  }
  x
}
Should I go through JNI and do the computation in native C?
Answer 0 (score: 0)
I assume your code runs in a distributed environment. If so, I think your best option is to use the parallel collections API.
Parallel collections execute tasks across the machine's multi-core hardware in a way that is simple and transparent for the developer.
The main advantage of this approach, in my view, is that if you add more hardware to your cloud service, your code is ready as-is; you don't need to change anything.
I ran some tests on your code and on a parallel implementation. Note that I ran these tests in the Scala REPL on my personal computer.

import scala.collection.parallel.mutable.ParArray
import scala.util.Random
// prepare arrays
val rnd = Random
// parallel arrays
val pArr1 = ParArray.tabulate(20000)(x => rnd.nextInt(100).toByte)
val pArr2 = ParArray.tabulate(20000)(x => rnd.nextInt(100).toByte)
// common arrays
val arr1 = pArr1.toArray
val arr2 = pArr2.toArray
println(pArr1)
println(pArr2)
println(arr1)
println(arr2)
println("Variables loaded")
// define parallel task
def parallel(arr1: ParArray[Byte], arr2: ParArray[Byte]): Unit = {
  val start = System.currentTimeMillis
  val r = (arr1 zip arr2).map(x => x._1 | x._2)
  //println(r)
  println(s"Execution time: ${System.currentTimeMillis - start}")
}
// define single thread task
def bitor(x: Array[Byte], y: Array[Byte]): Unit = {
  val start = System.currentTimeMillis
  for (i <- 0 until x.size) {
    x(i) = (x(i) | y(i)).toByte
  }
  //x.foreach(println)
  println(s"Execution time: ${System.currentTimeMillis - start}")
  // return x
}
println("functions defined")
I generated 20,000 random numbers below 100 and converted them to bytes.
After that, I executed each method (parallel and single-threaded) 20 times, like this:
(1 to 20).foreach(x => parallel(pArr1, pArr2))
// parallel method (in milliseconds)
1) Execution time: 10
2) Execution time: 3
3) Execution time: 6
4) Execution time: 4
5) Execution time: 29
6) Execution time: 4
7) Execution time: 4
8) Execution time: 3
9) Execution time: 3
10) Execution time: 6
11) Execution time: 1
12) Execution time: 2
13) Execution time: 1
14) Execution time: 1
15) Execution time: 4
16) Execution time: 1
17) Execution time: 1
18) Execution time: 2
19) Execution time: 1
20) Execution time: 1
Avg(11 to 20) = 1.5 milliseconds
// -------------------------------------------------------------------
(1 to 20).foreach(x => bitor(arr1, arr2))
// bitor method (in milliseconds)
1) Execution time: 1
2) Execution time: 0
3) Execution time: 0
4) Execution time: 1
5) Execution time: 0
6) Execution time: 0
7) Execution time: 1
8) Execution time: 0
9) Execution time: 0
10) Execution time: 3
11) Execution time: 0
12) Execution time: 0
13) Execution time: 0
14) Execution time: 0
15) Execution time: 2
16) Execution time: 0
17) Execution time: 3
18) Execution time: 0
19) Execution time: 1
20) Execution time: 0
Avg(11 to 20) = 0.6 milliseconds
I discarded the first ten executions to account for JIT compiler warm-up. See more here
As you can see, the bitor method is a bit faster than the parallel method. I am not sure whether the parallel method could be optimized through better use of the parallel API, but I think that in a distributed cloud environment the parallel approach should end up faster than bitor.
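As a hypothetical alternative (not part of the answer above), the intermediate allocations of `zip` and `map` can be avoided by parallelizing over indices and writing the result in place. One way to sketch this, using Java's parallel streams rather than the parallel collections API, is:

```scala
import java.util.stream.IntStream

object ParallelBitor {
  // Sketch of an in-place parallel OR: iterate over indices in parallel
  // instead of zipping, so no intermediate tuple array is allocated.
  // Assumes x and y have the same length; x is mutated and returned.
  def parallelBitor(x: Array[Byte], y: Array[Byte]): Array[Byte] = {
    IntStream.range(0, x.length).parallel().forEach { i =>
      x(i) = (x(i) | y(i)).toByte
    }
    x
  }
}
```

Whether this beats the single-threaded loop for 1 GB arrays would have to be measured; the per-element work here is tiny, so memory bandwidth, not CPU, may be the bottleneck.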
Answer 1 (score: 0)
Your code, which uses foreach, desugars to something roughly equivalent to this Java code:
public final class _$$anon$1$$anonfun$bitor$1 extends AbstractFunction1$mcVI$sp implements Serializable {
    private final byte[] x$1;
    private final byte[] y$1;

    public _$$anon$1$$anonfun$bitor$1(byte[] x$1, byte[] y$1) {
        this.x$1 = x$1;
        this.y$1 = y$1;
    }

    @Override
    public final void apply(final int i) {
        this.apply$mcVI$sp(i);
    }

    @Override
    public void apply$mcVI$sp(final int i) {
        this.x$1[i] |= this.y$1[i];
    }
}

private byte[] bitor(final byte[] x, final byte[] y) {
    RichInt.to$extension0(Predef.intWrapper(0), Predef.byteArrayOps(x).size())
        .foreach(new _$$anon$1$$anonfun$bitor$1(x, y));
    return x;
}
However, if you replace the for comprehension with a while loop, things change:
def bitor(x: Array[Byte], y: Array[Byte]): Array[Byte] = {
  var i = 0
  while (i < x.length) {
    x(i) = (x(i) | y(i)).toByte
    i += 1
  }
  x
}
which compiles down to:
private byte[] bitor(final byte[] x, final byte[] y) {
    for (int i = 0; i < x.length; ++i) {
        x[i] |= y[i];
    }
    return x;
}