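I rewrote a computational library to improve its memory management and found that this resulted in a speed increase. The original uses an array whose members are 12 doubles (that is, 96 bytes) apart in memory, whereas my arrays are contiguous.
How much of a speed increase would this difference account for?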
Answer 0 (score: 1)
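I created a small test program that measures array element access time for 1D and 2D arrays. It is a Console application in C# .NET, built in Release mode (with optimizations enabled). If the size of the 1D array is m, then the size of the 2D array is m x m.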
using System;
using System.Diagnostics;

public class PhysicsCalculator
{
    public const Int32 ArrayDimension = 1000;

    // Times ArrayDimension consecutive writes into a 1D array.
    public long CalculateSingleDimensionPerformance()
    {
        var arr = new double[ArrayDimension];
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        for (Int32 i = 0; i < ArrayDimension; i++)
        {
            arr[i] = i;
        }
        stopwatch.Stop();
        return stopwatch.ElapsedTicks;
    }

    // Times ArrayDimension writes into one fixed column (index 5) of a 2D array,
    // so successive writes land in different rows.
    public long CalculateDoubleDimensionPerformance()
    {
        var arr = new double[ArrayDimension, ArrayDimension];
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        for (Int32 i = 0; i < ArrayDimension; i++)
        {
            arr[i, 5] = i;
        }
        stopwatch.Stop();
        return stopwatch.ElapsedTicks;
    }
}
class Program
{
    static void Main(string[] args)
    {
        var physicsCalculator = new PhysicsCalculator();
        // Dummy calls so the runtime JIT-compiles the methods beforehand (to avoid JIT cost in the timed calls)
        physicsCalculator.CalculateSingleDimensionPerformance();
        physicsCalculator.CalculateDoubleDimensionPerformance();
        // Note: this prints TimeSpan ticks (100 ns each). Stopwatch.ElapsedTicks is counted in
        // units of 1 / Stopwatch.Frequency seconds, which may differ on a given machine.
        Console.WriteLine("Number of ticks per seconds = " + new TimeSpan(0, 0, 1).Ticks);
        Console.WriteLine();
        const int numberOfRepetetions = 1000;
        long elapsedTicks = 0;
        for (var i = 0; i < numberOfRepetetions; i++)
        {
            elapsedTicks += physicsCalculator.CalculateSingleDimensionPerformance();
        }
        Console.WriteLine("1D array : ");
        GenerateReport(elapsedTicks, numberOfRepetetions);
        elapsedTicks = 0;
        for (var i = 0; i < numberOfRepetetions; i++)
        {
            elapsedTicks += physicsCalculator.CalculateDoubleDimensionPerformance();
        }
        Console.WriteLine("2D array : ");
        GenerateReport(elapsedTicks, numberOfRepetetions);
        // Wait before exit
        Console.Read();
    }

    // Prints total and average elapsed ticks, converted to milliseconds
    // under the assumption that one tick equals one TimeSpan tick.
    private static void GenerateReport(long elapsedTicks, int numberOfRepetetions)
    {
        var oneSecond = new TimeSpan(0, 0, 1);
        Console.WriteLine("Array size = " + PhysicsCalculator.ArrayDimension);
        Console.WriteLine("Ticks (avg) = " + elapsedTicks / numberOfRepetetions);
        Console.WriteLine("Ticks (for {0} repetitions) = {1}", numberOfRepetetions, elapsedTicks);
        Console.WriteLine("Time taken (avg) = {0} ms", (elapsedTicks * oneSecond.TotalMilliseconds) / (numberOfRepetetions * oneSecond.Ticks));
        Console.WriteLine("Time taken (for {0} repetitions) = {1} ms", numberOfRepetetions,
            (elapsedTicks * oneSecond.TotalMilliseconds) / oneSecond.Ticks);
        Console.WriteLine();
    }
}
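The 2D loop above writes arr[i, 5], that is, one fixed column across successive rows. Assuming the usual row-major layout of a rectangular C# array and 8-byte doubles, the address arithmetic can be sketched as follows (an illustration of the layout, not a measurement; index and address are notation introduced here):

\[
\text{index}(i, j) = i \cdot m + j, \qquad
\text{address}(i+1, j) - \text{address}(i, j) = 8m \ \text{bytes}
\]

For m = 12 this stride is 96 bytes, which matches the 12-double spacing described in the question. For m = 1000 it is 8000 bytes.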
Results on my machine (2.8 GHz Phenom II quad core, 8 GB DDR2 800 MHz RAM, Windows 7 Ultimate x64):
Number of ticks per seconds = 10000000

1D array :
Array size = 1000
Ticks (avg) = 52
Ticks (for 1000 repetitions) = 52598
Time taken (avg) = 0.0052598 ms
Time taken (for 1000 repetitions) = 5.2598 ms

2D array :
Array size = 1000
Ticks (avg) = 13829
Ticks (for 1000 repetitions) = 13829984
Time taken (avg) = 1.3829984 ms
Time taken (for 1000 repetitions) = 1382.9984 ms
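Interestingly, the results are very clear: the access time for 2D array elements is significantly greater than for 1D array elements.
Determining whether the time taken is a function of the array size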
100
Number of ticks per seconds = 10000000

1D array :
Array size = 100
Ticks (avg) = 20
Ticks (for 1000 repetitions) = 20552
Time taken (avg) = 0.0020552 ms
Time taken (for 1000 repetitions) = 2.0552 ms

2D array :
Array size = 100
Ticks (avg) = 326
Ticks (for 1000 repetitions) = 326039
Time taken (avg) = 0.0326039 ms
Time taken (for 1000 repetitions) = 32.6039 ms

20
Number of ticks per seconds = 10000000

1D array :
Array size = 20
Ticks (avg) = 16
Ticks (for 1000 repetitions) = 16653
Time taken (avg) = 0.0016653 ms
Time taken (for 1000 repetitions) = 1.6653 ms

2D array :
Array size = 20
Ticks (avg) = 21
Ticks (for 1000 repetitions) = 21147
Time taken (avg) = 0.0021147 ms
Time taken (for 1000 repetitions) = 2.1147 ms

12 (your use case)
Number of ticks per seconds = 10000000

1D array :
Array size = 12
Ticks (avg) = 16
Ticks (for 1000 repetitions) = 16548
Time taken (avg) = 0.0016548 ms
Time taken (for 1000 repetitions) = 1.6548 ms

2D array :
Array size = 12
Ticks (avg) = 20
Ticks (for 1000 repetitions) = 20762
Time taken (avg) = 0.0020762 ms
Time taken (for 1000 repetitions) = 2.0762 ms
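As you can see, the array size does affect element access time. However, in your use case, where the array size is 12, the difference is 1D 0.0016548 ms vs. 2D 0.0020762 ms, i.e. 1D access is about 25% faster than 2D access.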
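For reference, the percentage can be recomputed from the two averages reported for array size 12. The symbols t_{1D} and t_{2D} are notation introduced here for the average 1D and 2D times:

\[
\frac{t_{2D}}{t_{1D}} = \frac{0.0020762}{0.0016548} \approx 1.255,
\qquad
\frac{t_{1D}}{t_{2D}} = \frac{0.0016548}{0.0020762} \approx 0.797
\]

So the 2D loop takes about 25% longer than the 1D loop, or equivalently the 1D loop takes about 20% less time than the 2D loop.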
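When the 2D array is m x 2
In the example above, if the size of the 1D array is m, then the size of the 2D array is m x m. When the size of the 2D array is reduced to m x 2, then for m = 12 the difference is barely 1.3%.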
To measure the performance on your system, I suggest you convert the above code to FORTRAN and run the benchmark with actual values.
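Before porting, the layout described in the question can also be tested more directly in C#. The following is a minimal sketch, not part of the benchmark above: it writes the same number of doubles into one flat array, once contiguously and once 12 elements (96 bytes) apart. The class name StrideBenchmark, the method TimeWrites, and the chosen count and repetition values are all hypothetical. It reads Stopwatch.Elapsed rather than ElapsedTicks, because Stopwatch ticks are defined by Stopwatch.Frequency and need not equal TimeSpan ticks.

```csharp
using System;
using System.Diagnostics;

// Hypothetical benchmark (not from the original answer): compares writing
// `count` doubles that sit next to each other with writing `count` doubles
// that are `stride` elements apart in one flat array.
public static class StrideBenchmark
{
    // Writes `count` elements of `data`, stepping `stride` elements each time,
    // repeats that `repetitions` times, and returns the elapsed time in milliseconds.
    public static double TimeWrites(double[] data, int count, int stride, int repetitions)
    {
        if ((long)(count - 1) * stride >= data.Length)
            throw new ArgumentException("data is too small for the given count and stride");

        var stopwatch = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; r++)
        {
            for (int i = 0; i < count; i++)
            {
                data[i * stride] = i;
            }
        }
        stopwatch.Stop();
        // Elapsed is a TimeSpan, so Stopwatch.Frequency is already accounted for.
        return stopwatch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int count = 1000;        // elements written per pass (assumed value)
        const int stride = 12;         // 12 doubles = 96 bytes, as in the question
        const int repetitions = 1000;  // assumed, same count as the benchmark above

        var data = new double[count * stride];

        // Warm-up calls so JIT compilation and first-touch page faults are not timed.
        TimeWrites(data, count, 1, 1);
        TimeWrites(data, count, stride, 1);

        double contiguousMs = TimeWrites(data, count, 1, repetitions);
        double stridedMs = TimeWrites(data, count, stride, repetitions);

        Console.WriteLine("Contiguous: {0} ms", contiguousMs);
        Console.WriteLine("Stride {0}: {1} ms", stride, stridedMs);
    }
}
```

Both timed runs reuse the same, already touched array, so allocation cost stays out of the comparison. The absolute numbers will depend on the machine and runtime, so they are best read as a ratio between the two runs.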