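Edit: My apologies to everyone. I used the term "jagged array" when I actually meant "multidimensional array" (as the example below shows). I apologize for using the wrong name. I actually found jagged arrays to be faster than multidimensional ones! I have added measurements for jagged arrays.
I was trying to use a multidimensional array today when I noticed that its performance was not what I would have expected. Using a single-dimensional array and calculating the indices manually was much faster (almost twice as fast) than using a 2D array. I wrote a test using 1024*1024 arrays (initialized to random values), ran 1000 iterations, and got the following results on my machine: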
sum(double[], int): 2738 ms (100%)
sum(double[,]): 5019 ms (183%)
sum(double[][]): 2540 ms ( 93%)
This is my test code:
public static double sum(double[] d, int l1) {
// assuming the array is rectangular
double sum = 0;
int l2 = d.Length / l1;
for (int i = 0; i < l1; ++i)
for (int j = 0; j < l2; ++j)
sum += d[i * l2 + j];
return sum;
}
public static double sum(double[,] d) {
double sum = 0;
int l1 = d.GetLength(0);
int l2 = d.GetLength(1);
for (int i = 0; i < l1; ++i)
for (int j = 0; j < l2; ++j)
sum += d[i, j];
return sum;
}
public static double sum(double[][] d) {
double sum = 0;
for (int i = 0; i < d.Length; ++i)
for (int j = 0; j < d[i].Length; ++j)
sum += d[i][j];
return sum;
}
public static void Main() {
Random random = new Random();
const int l1 = 1024, l2 = 1024;
double[ ] d1 = new double[l1 * l2];
double[,] d2 = new double[l1 , l2];
double[][] d3 = new double[l1][];
for (int i = 0; i < l1; ++i) {
d3[i] = new double[l2];
for (int j = 0; j < l2; ++j)
d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
}
//
const int iterations = 1000;
TestTime(sum, d1, l1, iterations);
TestTime(sum, d2, iterations);
TestTime(sum, d3, iterations);
}
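Further investigation showed that the IL for the second method is 23% larger than that of the first method (code size 68 vs 52). This is mostly due to the calls to System.Array::GetLength(int). The compiler also emits calls to Array::Get for the multidimensional array, whereas it simply emits ldelem for the simple array.
So I am wondering: why is access through multidimensional arrays slower than through normal arrays? I would have assumed the compiler (or the JIT) would do something similar to what I did in my first method, but this was not the case.
Could you please help me understand why this happens?
Update: Following Henk Holterman's suggestion, here is the implementation of TestTime: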
public static void TestTime<T, TR>(Func<T, TR> action, T obj,
int iterations)
{
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < iterations; ++i)
action(obj);
Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}
public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1,
T2 obj2, int iterations)
{
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < iterations; ++i)
action(obj1, obj2);
Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}
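Answer 0 (score: 42)
Single-dimensional arrays with a lower bound of 0 are a different type from multidimensional or non-zero-lower-bound arrays within IL (vector vs array, IIRC). A vector is simpler to work with: to get to element x, you just compute pointer + size * x. For an array, you have to compute pointer + size * (x - lower bound) for a single-dimensional array, and do yet more arithmetic for each dimension you add.
Basically, the CLR is optimized for the far more common case.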
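The index arithmetic described above can be sketched in C#. This is only an illustration of the formula, not the CLR's actual implementation: the names ArrayAddressing and GeneralOffset are hypothetical, offsets are counted in elements rather than bytes, and the non-zero lower bounds are made up for the example.

```csharp
using System;

public static class ArrayAddressing
{
    // General case: subtract each dimension's lower bound, then combine
    // the per-dimension indices in row-major order. For a zero-based
    // vector this collapses to "offset = x".
    public static int GeneralOffset(int[] indices, int[] lowerBounds, int[] lengths)
    {
        int offset = 0;
        for (int dim = 0; dim < indices.Length; ++dim)
            offset = offset * lengths[dim] + (indices[dim] - lowerBounds[dim]);
        return offset;
    }

    public static void Main()
    {
        int[] lengths = { 3, 4 };
        int[] lowerBounds = { 1, 10 }; // hypothetical non-zero lower bounds
        Array a = Array.CreateInstance(typeof(double), lengths, lowerBounds);

        // Store a running counter so each element holds its own storage position.
        int n = 0;
        for (int i = 1; i <= 3; ++i)
            for (int j = 10; j <= 13; ++j)
                a.SetValue((double)n++, i, j);

        // (2 - 1) * 4 + (12 - 10) = 6, which matches the value stored at [2, 12].
        Console.WriteLine(GeneralOffset(new[] { 2, 12 }, lowerBounds, lengths));
        Console.WriteLine(a.GetValue(2, 12));
    }
}
```

Every extra dimension adds one subtraction, one multiplication, and one addition per access, which is the extra work the answer refers to.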
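Answer 1 (score: 9)
Array bounds checking?
A single-dimensional array has a Length member that you access directly; when compiled, this is just a memory read.
A multidimensional array requires a GetLength(int dimension) method call, which processes the argument to get the relevant length for that dimension. That does not compile down to a memory read, so you get a method call, etc.
In addition, GetLength(int dimension) does a bounds check on its parameter.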
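The difference between the two length lookups can be observed from C# itself. The sketch below is only a demonstration; the names LengthLookup and GetLengthRejects are hypothetical, and whether a given JIT turns GetLength into something cheaper than a call depends on the runtime version.

```csharp
using System;

public static class LengthLookup
{
    // Returns true when GetLength rejects the requested dimension,
    // which shows that the argument itself is validated on each call.
    public static bool GetLengthRejects(Array array, int dimension)
    {
        try
        {
            array.GetLength(dimension);
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        double[] vector = new double[8];
        double[,] grid = new double[2, 4];

        Console.WriteLine(vector.Length);             // 8
        Console.WriteLine(grid.Length);               // 8 (total element count)
        Console.WriteLine(grid.GetLength(0));         // 2
        Console.WriteLine(grid.GetLength(1));         // 4
        Console.WriteLine(GetLengthRejects(grid, 2)); // True: a 2D array has no dimension 2
    }
}
```

Note that the question's code already hoists both GetLength calls out of the loops, so this cost is paid once per call to sum, not once per element.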
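Answer 2 (score: 4)
Funny, I ran the code from above on a Vista box using VS2008, .NET 3.5 SP1, Win32. In a release/optimized build the difference was barely measurable, while in a debug/non-optimized build the multidimensional arrays were much slower. (I ran the three tests twice to reduce the JIT effect on the second set.)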
Here are my numbers:
sum took 00:00:04.3356535
sum took 00:00:04.1957663
sum took 00:00:04.5523050
sum took 00:00:04.0183060
sum took 00:00:04.1785843
sum took 00:00:04.4933085
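Look at the second set of three numbers. The difference is not enough for me to code everything in single-dimensional arrays.
Although I haven't posted them, in Debug/unoptimized builds multidimensional versus single/jagged arrays does make a huge difference.
Full program: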
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
namespace single_dimension_vs_multidimension
{
class Program
{
public static double sum(double[] d, int l1) { // assuming the array is rectangular
double sum = 0;
int l2 = d.Length / l1;
for (int i = 0; i < l1; ++i)
for (int j = 0; j < l2; ++j)
sum += d[i * l2 + j];
return sum;
}
public static double sum(double[,] d)
{
double sum = 0;
int l1 = d.GetLength(0);
int l2 = d.GetLength(1);
for (int i = 0; i < l1; ++i)
for (int j = 0; j < l2; ++j)
sum += d[i, j];
return sum;
}
public static double sum(double[][] d)
{
double sum = 0;
for (int i = 0; i < d.Length; ++i)
for (int j = 0; j < d[i].Length; ++j)
sum += d[i][j];
return sum;
}
public static void TestTime<T, TR>(Func<T, TR> action, T obj, int iterations)
{
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < iterations; ++i)
action(obj);
Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}
public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1, T2 obj2, int iterations)
{
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < iterations; ++i)
action(obj1, obj2);
Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}
public static void Main() {
Random random = new Random();
const int l1 = 1024, l2 = 1024;
double[ ] d1 = new double[l1 * l2];
double[,] d2 = new double[l1 , l2];
double[][] d3 = new double[l1][];
for (int i = 0; i < l1; ++i)
{
d3[i] = new double[l2];
for (int j = 0; j < l2; ++j)
d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
}
const int iterations = 1000;
TestTime<double[], int, double>(sum, d1, l1, iterations);
TestTime<double[,], double>(sum, d2, iterations);
TestTime<double[][], double>(sum, d3, iterations);
TestTime<double[], int, double>(sum, d1, l1, iterations);
TestTime<double[,], double>(sum, d2, iterations);
TestTime<double[][], double>(sum, d3, iterations);
}
}
}
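Answer 3 (score: 3)
Because a multidimensional array is just syntactic sugar: it is really a flat array with some index-calculation magic. A jagged array, on the other hand, is an array of arrays. With a two-dimensional array, accessing an element requires reading memory only once, whereas with a two-level jagged array you need to read memory twice.
Edit: Apparently the original poster mixed up "jagged arrays" with "multidimensional arrays", so my reasoning doesn't exactly stand. For the real reason, check Jon Skeet's heavy-artillery answer above.
Answer 4 (score: 2)
Jagged arrays are arrays of class references (other arrays) all the way down to the leaf arrays, which may be arrays of a primitive type. So the memory allocated for each of those other arrays can be all over the place.
A multidimensional array, by contrast, has its memory allocated in one contiguous block.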
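The layout difference can be made visible with a small sketch. The names LayoutDemo and Flatten are hypothetical; the block-copy relies on Buffer.BlockCopy accepting any array of primitives, including a multidimensional one.

```csharp
using System;

public static class LayoutDemo
{
    // A double[,] is one contiguous block, so its bytes can be copied
    // into a flat double[] in a single call; elements arrive in row-major order.
    public static double[] Flatten(double[,] grid)
    {
        double[] flat = new double[grid.Length];
        Buffer.BlockCopy(grid, 0, flat, 0, grid.Length * sizeof(double));
        return flat;
    }

    public static void Main()
    {
        double[,] grid = { { 1, 2, 3 }, { 4, 5, 6 } };
        Console.WriteLine(string.Join(", ", Flatten(grid))); // 1, 2, 3, 4, 5, 6

        // In a jagged array each row is a separate object reached through a
        // reference: rows may differ in length and may even be shared.
        double[][] jagged = { new double[] { 1, 2, 3 }, new double[] { 4 } };
        jagged[1] = jagged[0]; // both slots now refer to the same row object
        jagged[1][0] = 99;
        Console.WriteLine(jagged[0][0]); // 99
    }
}
```

The same single-call copy is not possible for double[][], because its rows are independent heap objects.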
Answer 5 (score: 2)
Which one is fastest depends on your array size.
// * Summary *
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.997 (1909/November2018Update/19H2)
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.302
[Host] : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT
.NET Core 3.1 : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT
Job=.NET Core 3.1 Runtime=.NET Core 3.1
| Method | D | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------- |----- |----------------:|--------------:|--------------:|-----------:|----------:|----------:|-----------:|
| 'double[D1][D2]' | 10 | 376.2 ns | 7.57 ns | 12.00 ns | 0.3643 | - | - | 1144 B |
| 'double[D1, D2]' | 10 | 325.5 ns | 3.71 ns | 3.47 ns | 0.2675 | - | - | 840 B |
| 'double[D1][D2]' | 50 | 4,821.4 ns | 44.71 ns | 37.34 ns | 6.8893 | - | - | 21624 B |
| 'double[D1, D2]' | 50 | 5,834.1 ns | 64.35 ns | 60.20 ns | 6.3629 | - | - | 20040 B |
| 'double[D1][D2]' | 100 | 19,124.4 ns | 230.39 ns | 454.77 ns | 26.2756 | 0.7019 | - | 83224 B |
| 'double[D1, D2]' | 100 | 23,561.4 ns | 299.18 ns | 279.85 ns | 24.9939 | - | - | 80040 B |
| 'double[D1][D2]' | 500 | 1,248,458.7 ns | 11,241.19 ns | 10,515.01 ns | 322.2656 | 160.1563 | - | 2016025 B |
| 'double[D1, D2]' | 500 | 966,940.8 ns | 5,694.46 ns | 5,326.60 ns | 303.7109 | 303.7109 | 303.7109 | 2000034 B |
| 'double[D1][D2]' | 1000 | 8,987,202.8 ns | 97,133.16 ns | 90,858.41 ns | 1421.8750 | 578.1250 | 265.6250 | 8032582 B |
| 'double[D1, D2]' | 1000 | 3,628,421.3 ns | 72,240.02 ns | 177,206.01 ns | 179.6875 | 179.6875 | 179.6875 | 8000036 B |
| 'double[D1][D2]' | 1500 | 26,496,994.4 ns | 380,625.25 ns | 356,037.09 ns | 3406.2500 | 1500.0000 | 531.2500 | 18048064 B |
| 'double[D1, D2]' | 1500 | 12,417,733.7 ns | 243,802.76 ns | 260,866.22 ns | 156.2500 | 156.2500 | 156.2500 | 18000038 B |
| 'double[D1][D2]' | 3000 | 86,943,097.4 ns | 485,339.32 ns | 405,280.31 ns | 12833.3333 | 7000.0000 | 1333.3333 | 72096325 B |
| 'double[D1, D2]' | 3000 | 57,969,405.9 ns | 393,463.61 ns | 368,046.11 ns | 222.2222 | 222.2222 | 222.2222 | 72000100 B |
// * Hints *
Outliers
MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 1 outlier was removed (449.71 ns)
MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 2 outliers were removed, 3 outliers were detected (4.75 us, 5.10 us, 5.28 us)
MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 13 outliers were removed (21.27 us..30.62 us)
MultidimensionalArrayBenchmark.'double[D1, D2]': .NET Core 3.1 -> 1 outlier was removed (4.19 ms)
MultidimensionalArrayBenchmark.'double[D1, D2]': .NET Core 3.1 -> 3 outliers were removed, 4 outliers were detected (11.41 ms, 12.94 ms..13.61 ms)
MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 2 outliers were removed (88.68 ms, 89.27 ms)
// * Legends *
D : Value of the 'D' parameter
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
Gen 0 : GC Generation 0 collects per 1000 operations
Gen 1 : GC Generation 1 collects per 1000 operations
Gen 2 : GC Generation 2 collects per 1000 operations
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
1 ns : 1 Nanosecond (0.000000001 sec)
[SimpleJob(BenchmarkDotNet.Jobs.RuntimeMoniker.NetCoreApp31)]
[MemoryDiagnoser]
public class MultidimensionalArrayBenchmark {
[Params(10, 50, 100, 500, 1000, 1500, 3000)]
public int D { get; set; }
[Benchmark(Description = "double[D1][D2]")]
public double[][] JaggedArray() {
var array = new double[D][];
for (int i = 0; i < array.Length; i++) {
var subArray = new double[D];
array[i] = subArray;
for (int j = 0; j < subArray.Length; j++) {
subArray[j] = j + i * 10;
}
}
return array;
}
[Benchmark(Description = "double[D1, D2]")]
public double[,] MultidimensionalArray() {
var array = new double[D, D];
for (int i = 0; i < D; i++) {
for (int j = 0; j < D; j++) {
array[i, j] = j + i * 10;
}
}
return array;
}
}
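Answer 6 (score: 1)
I think it has something to do with the fact that jagged arrays are actually arrays of arrays, so there are two levels of indirection to get to the actual data.
Answer 7 (score: 1)
I'm with everyone else here.
I had a program with a three-dimensional array, and let me tell you, when I moved it to a two-dimensional array I saw a huge boost, and then I moved to a one-dimensional array.
In the end, I think I saw over a 500% improvement in execution time.
The only drawback was the added complexity of working out where everything is in the one-dimensional array, compared with the three-dimensional one.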
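One way to contain that added complexity is to hide the index arithmetic behind a small wrapper. The sketch below is a hypothetical helper, not code from the answer: Flat3D and Offset are made-up names, and it assumes row-major ordering over a single double[].

```csharp
using System;

// Hypothetical helper: a three-dimensional view over one flat double[].
public sealed class Flat3D
{
    private readonly double[] data;
    private readonly int size1, size2; // extents of the second and third dimension

    public Flat3D(int size0, int size1, int size2)
    {
        this.size1 = size1;
        this.size2 = size2;
        data = new double[size0 * size1 * size2];
    }

    // Row-major mapping from (i, j, k) to a position in the flat array.
    // Only the final flat index is bounds-checked by the runtime, so an
    // out-of-range j or k can silently address a neighbouring element.
    public int Offset(int i, int j, int k) => (i * size1 + j) * size2 + k;

    public double this[int i, int j, int k]
    {
        get => data[Offset(i, j, k)];
        set => data[Offset(i, j, k)] = value;
    }
}

public static class Flat3DDemo
{
    public static void Main()
    {
        var cube = new Flat3D(2, 3, 4);
        cube[1, 2, 3] = 7.0;
        Console.WriteLine(cube.Offset(1, 2, 3)); // (1 * 3 + 2) * 4 + 3 = 23
        Console.WriteLine(cube[1, 2, 3]);        // 7
    }
}
```

Whether the indexer is as fast as hand-written arithmetic depends on the JIT inlining it, so it is worth measuring before relying on it in a hot loop.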
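Answer 8 (score: 1)
I think multidimensional arrays are slower because the runtime has to perform two or more bounds checks per access (three or more for three dimensions and up).
Answer 9 (score: -1)
Bounds checking. In the first example, your "j" index could exceed l2 as long as "i" is less than l1, and the access would still land inside the flat array. That would not be legal in the second example.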