从连续和非连续数组读取的速度

时间:2011-08-19 14:32:11

标签: performance memory

我重写了一个计算库来改进内存管理,并发现这导致了速度的提高。在原文中它使用一个数组,其成员在内存中是12个双精度(如此96个字节),而我的数组是连续的。

这种差异会提高多少速度?

1 个答案:

答案 0 :(得分:1)

我创建了一个小型测试程序,用于计算1D和2D阵列的数组元素访问时间。它是在Console模式下构建的C#.NET中的Release应用程序(启用了优化)。如果1D数组的大小为m,那么2D数组的大小为m x m

public class PhysicsCalculator
{
    public const Int32 ArrayDimension = 1000;

    public long CalculateSingleDimensionPerformance()
    {
        var arr = new double[ArrayDimension];

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        for (Int32 i = 0; i < ArrayDimension; i++)
        {
            arr[i] = i;
        }

        stopwatch.Stop();

        return stopwatch.ElapsedTicks;
    }

    public long CalculateDoubleDimensionPerformance()
    {
        var arr = new double[ArrayDimension, ArrayDimension];

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        for (Int32 i = 0; i < ArrayDimension; i++)
        {
            arr[i, 5] = i;
        }

        stopwatch.Stop();

        return stopwatch.ElapsedTicks;
    }
}

class Program
{
    static void Main(string[] args)
    {
        var physicsCalculator = new PhysicsCalculator();

        // This is a dummy call to tell the runtime to jit the methods before hand (to avoid jitting on first call)
        physicsCalculator.CalculateSingleDimensionPerformance();
        physicsCalculator.CalculateDoubleDimensionPerformance();

        Console.WriteLine("Number of ticks per seconds = " + new TimeSpan(0, 0, 1).Ticks);
        Console.WriteLine();

        const int numberOfRepetetions = 1000;
        long elapsedTicks = 0;

        for (var i = 0; i < numberOfRepetetions; i++)
        {
            elapsedTicks += physicsCalculator.CalculateSingleDimensionPerformance();
        }

        Console.WriteLine("1D array : ");
        GenerateReport(elapsedTicks, numberOfRepetetions);

        elapsedTicks = 0;
        for (var i = 0; i < numberOfRepetetions; i++)
        {
            elapsedTicks += physicsCalculator.CalculateDoubleDimensionPerformance();
        }

        Console.WriteLine("2D array : ");
        GenerateReport(elapsedTicks, numberOfRepetetions);

        // Wait before exit
        Console.Read();
    }

    private static void GenerateReport(long elapsedTicks, int numberOfRepetetions)
    {
        var oneSecond = new TimeSpan(0, 0, 1);

        Console.WriteLine("Array size = " + PhysicsCalculator.ArrayDimension);
        Console.WriteLine("Ticks (avg) = " + elapsedTicks / numberOfRepetetions);
        Console.WriteLine("Ticks (for {0} repetitions) = {1}", numberOfRepetetions, elapsedTicks);
        Console.WriteLine("Time taken (avg) = {0} ms", (elapsedTicks * oneSecond.TotalMilliseconds) / (numberOfRepetetions * oneSecond.Ticks));
        Console.WriteLine("Time taken (for {0} repetitions) = {1} ms", numberOfRepetetions,
           (elapsedTicks * oneSecond.TotalMilliseconds) / oneSecond.Ticks);

        Console.WriteLine();
    }
}

我的机器上的结果(2.8 GHz Phenom II四核,8 GB DDR2 800 MHz RAM,Windows 7 Ultimate x64)

Number of ticks per seconds = 10000000

1D array : Array size = 1000 
Ticks (avg) = 52 
Ticks (for 1000 repetitions) = 52598 
Time taken (avg) = 0.0052598 ms 
Time taken (for 1000 repetitions) = 5.2598 ms

2D array : Array size = 1000 
Ticks (avg) = 13829
Ticks (for 1000 repetitions) = 13829984 
Time taken (avg) = 1.3829984 ms 
Time taken (for 1000 repetitions) = 1382.9984 ms

有趣的是,结果非常清楚,2D数组元素的访问时间显着大于1D数组元素的访问时间。


确定所花费的时间是否是数组大小的函数

  • 数组大小为100
Number of ticks per seconds = 10000000

1D array : 
Array size = 100
Ticks (avg) = 20
Ticks (for 1000 repetitions) = 20552
Time taken (avg) = 0.0020552 ms
Time taken (for 1000 repetitions) = 2.0552 ms

2D array : 
Array size = 100
Ticks (avg) = 326
Ticks (for 1000 repetitions) = 326039
Time taken (avg) = 0.0326039 ms
Time taken (for 1000 repetitions) = 32.6039 ms
  • 数组大小为20
Number of ticks per seconds = 10000000

1D array : 
Array size = 20
Ticks (avg) = 16
Ticks (for 1000 repetitions) = 16653
Time taken (avg) = 0.0016653 ms
Time taken (for 1000 repetitions) = 1.6653 ms

2D array : 
Array size = 20
Ticks (avg) = 21
Ticks (for 1000 repetitions) = 21147
Time taken (avg) = 0.0021147 ms
Time taken (for 1000 repetitions) = 2.1147 ms
  • 数组大小为12(您的用例)
Number of ticks per seconds = 10000000

1D array : 
Array size = 12
Ticks (avg) = 16
Ticks (for 1000 repetitions) = 16548
Time taken (avg) = 0.0016548 ms
Time taken (for 1000 repetitions) = 1.6548 ms

2D array : 
Array size = 12
Ticks (avg) = 20
Ticks (for 1000 repetitions) = 20762
Time taken (avg) = 0.0020762 ms
Time taken (for 1000 repetitions) = 2.0762 ms

如您所见,数组大小确实会影响元素访问时间。但是,在数组大小为12的用例中,差异大约是(1D 0.0016548 ms与2D 0.0020762 ms相比{}} {%}即1D访问速度提高25%比2D访问。


当2D阵列

时,较低的阵列尺寸较小

在上面的示例中,如果1D数组的大小为25,那么2D数组的大小为m。当二维数组的大小减少到m x m时,我得到m x 2

的以下结果
m = 12

在这种情况下,差异几乎不是1.3%。

为了衡量系统的性能,我建议您在FORTRAN中转换上述代码并使用实际值运行基准测试。