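Array copy performance

Time: 2014-11-16 23:35:35

Tags: java performance

I want to multiply two matrices, so I decided to split the matrix into several parts. I wrote two different matrix-split functions, and the result confuses me. One of them uses System.arraycopy and the other uses a for loop. I found that the for loop runs faster than the arraycopy version.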

 private static int[][] getPartOfMatrix(int[][] matrix, int size, int part) {

        int[][] newMatrix = new int[size][matrix[0].length];

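        // r is the row index inside the new block; indexing newMatrix with i
        // would run past its `size` rows whenever part > 0.
        for (int i = part * size, r = 0; i < (part + 1) * size; i++, r++) {
            System.arraycopy(matrix[i], 0, newMatrix[r], 0, matrix[i].length);
        }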

        return newMatrix;
    }

 private static int[][] getPartOfMatrix2(int[][] matrix, int size, int part) {

        int[][] newMatrix = new int[size][matrix[0].length];

        for (int i = part * size, r = 0; i < (part + 1) * size; i++, r++) {
            for (int j = 0; j < matrix[0].length; j++) {
                newMatrix[r][j] = matrix[i][j];
            }
        }

        return newMatrix;
    }

Which one should I use, and why?

1 answer:

Answer 0 (score: 1)

package tests;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
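// UnsafeConstants is not shown in this post; it is assumed to supply U (a sun.misc.Unsafe
// instance), INT_BASE and INT_SCALE_SHIFT, which unsafeCopyMemory uses below.
public class CopyArray implements UnsafeConstants {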

    @Param({"0", "1", "10", "16", "1000", "1024", "8192"})
    public int arraySize;
    public int[] a;
    public int[] copy;

    @Setup
    public void setup() {
        a = new int[arraySize];
        copy = new int[arraySize];
    }

    @Benchmark
    public int[] arrayCopy(CopyArray state) {
        int[] a = state.a;
        int[] copy = state.copy;
        System.arraycopy(a, 0, copy, 0, a.length);
        return copy;
    }

    @Benchmark
    public int[] forLoop(CopyArray state) {
        int[] a = state.a;
        int arraySize = a.length;
        int[] copy = state.copy;
        for (int i = 0; i < arraySize; i++) {
            copy[i] = a[i];
        }
        return copy;
    }

    @Benchmark
    public int[] unsafeCopyMemory(CopyArray state) {
        int[] a = state.a;
        int arraySize = a.length;
        int[] copy = state.copy;
        U.copyMemory(a, INT_BASE, copy, INT_BASE, arraySize << INT_SCALE_SHIFT);
        return copy;
    }
}

Results:

Benchmark                       (arraySize)  Mode  Samples     Score     Error  Units
t.CopyArray.arrayCopy                     0  avgt       10     3,598 ±   0,385  ns/op
t.CopyArray.arrayCopy                     1  avgt       10     7,566 ±   0,961  ns/op
t.CopyArray.arrayCopy                    10  avgt       10     8,629 ±   0,988  ns/op
t.CopyArray.arrayCopy                    16  avgt       10     9,994 ±   0,667  ns/op
t.CopyArray.arrayCopy                  1000  avgt       10   164,613 ±  19,103  ns/op
t.CopyArray.arrayCopy                  1024  avgt       10   320,658 ±  26,458  ns/op
t.CopyArray.arrayCopy                  8192  avgt       10  2468,847 ± 204,341  ns/op
t.CopyArray.forLoop                       0  avgt       10     2,598 ±   0,194  ns/op
t.CopyArray.forLoop                       1  avgt       10     4,161 ±   0,841  ns/op
t.CopyArray.forLoop                      10  avgt       10    10,056 ±   1,166  ns/op
t.CopyArray.forLoop                      16  avgt       10    11,004 ±   1,477  ns/op
t.CopyArray.forLoop                    1000  avgt       10   207,118 ±  36,371  ns/op
t.CopyArray.forLoop                    1024  avgt       10   206,291 ±  26,327  ns/op
t.CopyArray.forLoop                    8192  avgt       10  1867,073 ± 238,488  ns/op
t.CopyArray.unsafeCopyMemory              0  avgt       10     7,080 ±   0,082  ns/op
t.CopyArray.unsafeCopyMemory              1  avgt       10     8,257 ±   0,184  ns/op
t.CopyArray.unsafeCopyMemory             10  avgt       10     8,424 ±   0,365  ns/op
t.CopyArray.unsafeCopyMemory             16  avgt       10    10,129 ±   0,076  ns/op
t.CopyArray.unsafeCopyMemory           1000  avgt       10   213,239 ±  30,729  ns/op
t.CopyArray.unsafeCopyMemory           1024  avgt       10   310,881 ±  34,527  ns/op
t.CopyArray.unsafeCopyMemory           8192  avgt       10  2419,456 ± 238,488  ns/op
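
To compare rows of different sizes, it can help to divide each score by the element count. The sketch below does that for the arraySize = 8192 rows; the class and method names (PerElementCost, nsPerElement) are made up for illustration, and the scores are typed in from the table with a decimal point instead of the decimal comma JMH printed.

```java
import java.util.Locale;

public class PerElementCost {

    // Average nanoseconds per copied int, given a JMH score in ns/op.
    public static double nsPerElement(double nsPerOp, int arraySize) {
        return nsPerOp / arraySize;
    }

    public static void main(String[] args) {
        int n = 8192;
        // Scores for arraySize = 8192, copied from the results table.
        double arrayCopy = 2468.847;
        double forLoop = 1867.073;
        double unsafeCopyMemory = 2419.456;

        System.out.printf(Locale.ROOT, "arrayCopy        %.3f ns/element%n", nsPerElement(arrayCopy, n));
        System.out.printf(Locale.ROOT, "forLoop          %.3f ns/element%n", nsPerElement(forLoop, n));
        System.out.printf(Locale.ROOT, "unsafeCopyMemory %.3f ns/element%n", nsPerElement(unsafeCopyMemory, n));
    }
}
```

That works out to roughly 0.30, 0.23 and 0.30 ns per element, so on this machine the gap at 8192 elements is a fraction of a nanosecond per int.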

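Conclusions:

  • Unsafe.copyMemory is never the option to choose.
  • The for loop beats System.arraycopy when the array size is a power of 2 (clearly so at 1024 and 8192 in the results above; at 16 the two are within the error margin).
  • Otherwise, you need to do extra research for your specific matrix row width.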
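
Applied to the row-block split from the question, the two strategies can be put side by side as in the sketch below. It is a minimal illustration, not the asker's exact code: the class name MatrixSplit and both method names are invented here, and the destination row is indexed relative to the block (0 to size - 1) so the code also works for part > 0. It only checks that both variants return the same rows; it says nothing about which is faster for a given row width.

```java
import java.util.Arrays;

public class MatrixSplit {

    // Copies rows [part * size, (part + 1) * size) with one System.arraycopy per row.
    public static int[][] splitWithArraycopy(int[][] matrix, int size, int part) {
        int[][] block = new int[size][matrix[0].length];
        int offset = part * size;
        for (int r = 0; r < size; r++) {
            int[] src = matrix[offset + r];
            System.arraycopy(src, 0, block[r], 0, src.length);
        }
        return block;
    }

    // Copies the same rows with an explicit element-by-element loop.
    public static int[][] splitWithLoop(int[][] matrix, int size, int part) {
        int cols = matrix[0].length;
        int[][] block = new int[size][cols];
        int offset = part * size;
        for (int r = 0; r < size; r++) {
            int[] src = matrix[offset + r];
            int[] dst = block[r];
            for (int j = 0; j < cols; j++) {
                dst[j] = src[j];
            }
        }
        return block;
    }

    public static void main(String[] args) {
        // 4 x 3 matrix holding 0..11 in row-major order.
        int[][] m = new int[4][3];
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 3; j++) {
                m[i][j] = i * 3 + j;
            }
        }
        int[][] a = splitWithArraycopy(m, 2, 1);
        int[][] b = splitWithLoop(m, 2, 1);
        System.out.println(Arrays.deepToString(a)); // [[6, 7, 8], [9, 10, 11]]
        System.out.println(Arrays.deepEquals(a, b)); // true
    }
}
```

Both methods return independent copies of the rows, so either can be swapped in and then measured with the actual row width of the matrices being multiplied.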