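I'm working on some matrix code and I'm worried about performance.
Here is how it works: I have an IMatrix
abstract class (containing all the matrix operations and so on), which is implemented by a ColumnMatrix
class.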
abstract class IMatrix
{
public int Rows {get;set;}
public int Columns {get;set;}
public abstract float At(int row, int column);
}
class ColumnMatrix : IMatrix
{
private float[] data;
public override float At(int row, int column)
{
return data[row + column * this.Rows];
}
}
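This class is used heavily throughout my application, and I'm concerned about its performance. In a test that does nothing but read a 2000000x15 matrix and a jagged array of the same size, I get 1359ms for the array access and 9234ms for the matrix access: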
public void TestAccess()
{
int iterations = 10;
int rows = 2000000;
int columns = 15;
ColumnMatrix matrix = new ColumnMatrix(rows, columns);
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
matrix[i, j] = i + j;
float[][] equivalentArray = matrix.ToRowsArray();
TimeSpan totalMatrix = new TimeSpan(0);
TimeSpan totalArray = new TimeSpan(0);
float total = 0f;
for (int iteration = 0; iteration < iterations; iteration++)
{
total = 0f;
DateTime start = DateTime.Now;
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
total = matrix.At(i, j);
totalMatrix += (DateTime.Now - start);
total += 1f; //Ensure total is read at least once.
total = total > 0 ? 0f : 0f;
start = DateTime.Now;
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
total = equivalentArray[i][j];
totalArray += (DateTime.Now - start);
}
if (total < 0f)
logger.Info("Nothing here, just make sure we read total at least once.");
logger.InfoFormat("Average time for a {0}x{1} access, matrix : {2}ms", rows, columns, totalMatrix.TotalMilliseconds);
logger.InfoFormat("Average time for a {0}x{1} access, array : {2}ms", rows, columns, totalArray.TotalMilliseconds);
Assert.IsTrue(true);
}
So my question is: how can I make this faster? Is there any way to make my ColumnMatrix.At faster? Cheers!
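Answer 0 (score: 4)
You wrote abstract class IMatrix. That name is wrong, because it is not an interface, and calling an overridden method is slower than calling a final (i.e. non-virtual) method.
Answer 1 (score: 3)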
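If a two-dimensional array performs so much better, why not use a two-dimensional array as your class's internal storage, instead of a one-dimensional array with the overhead of computing the index?
Answer 2 (score: 3)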
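The array code you wrote is easy to optimize because it accesses memory sequentially. That means the JIT compiler can probably do a better job of turning it into native code, which gives better performance.
Another thing you haven't considered is that inlining is still hit and miss. If your At method (why not use an indexer property, by the way?) is not inlined, you will take a big performance hit from the call and the stack manipulation. Finally, you should consider sealing the ColumnMatrix class, because that makes optimization much easier for the JIT compiler (call is definitely better than callvirt).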
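To make the sealing and indexer suggestions concrete, here is a minimal sketch. The names MatrixBase and SealedColumnMatrix are made up for illustration, and the AggressiveInlining hint assumes .NET 4.5 or later; it is only a request, so the JIT may still decline to inline.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical base class standing in for the question's IMatrix.
abstract class MatrixBase
{
    public int Rows { get; protected set; }
    public int Columns { get; protected set; }
    public abstract float this[int row, int column] { get; set; }
}

// sealed: no further overrides can exist, so calls made through a
// SealedColumnMatrix reference are candidates for devirtualization.
sealed class SealedColumnMatrix : MatrixBase
{
    private readonly float[] data;

    public SealedColumnMatrix(int rows, int columns)
    {
        Rows = rows;
        Columns = columns;
        data = new float[rows * columns];
    }

    // Column-major layout, same index formula as ColumnMatrix.At.
    public override float this[int row, int column]
    {
        // Assumption: .NET 4.5+; this is a hint, not a guarantee.
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return data[row + column * Rows]; }
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        set { data[row + column * Rows] = value; }
    }
}
```

The benefit only shows up when the variable is typed as SealedColumnMatrix; code that holds the matrix through the abstract base type still goes through virtual dispatch.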
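Answer 3 (score: 2)
When you use DateTime.Now to measure performance, the results are fairly random. The clock's resolution is coarse (typically somewhere around 10 to 16 ms on Windows), so you are not measuring actual elapsed time but rather where in your code the clock happens to tick.
You should use the Stopwatch class instead, which has a much higher resolution.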
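As a rough sketch of what that could look like, the helper below times an action with Stopwatch. AccessTimer and AverageMilliseconds are hypothetical names, and the warm-up call is my own addition so that JIT compilation is not counted.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the question's code.
static class AccessTimer
{
    // Runs the action once as a warm-up, then returns the mean
    // elapsed time in milliseconds over the given number of runs.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        action(); // warm-up: keeps JIT compilation out of the measurement

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Each of the two read loops from TestAccess could be passed in as a lambda. The delegate call adds a small constant cost per run, which should be negligible next to 30 million element reads.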
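Answer 4 (score: 1)
Every element access costs you a multiplication: row + column * this.Rows. You might check whether you could use a two-dimensional array internally as well.
You also pay extra overhead for abstracting things away in a class: every time you access an element of the matrix, you make an additional method call.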
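One way to avoid both costs in bulk operations is to work on the backing array directly and compute the column offset once per column. The sketch below assumes the same column-major layout as ColumnMatrix; ColumnMajor and SumColumn are hypothetical names.

```csharp
// Hypothetical helper operating on a column-major float[] like ColumnMatrix's.
static class ColumnMajor
{
    // Sums one column without a per-element multiplication or method call.
    public static float SumColumn(float[] data, int rows, int column)
    {
        int offset = column * rows; // computed once per column
        float sum = 0f;
        for (int row = 0; row < rows; row++)
            sum += data[offset + row]; // consecutive elements in memory
        return sum;
    }
}
```

Going column by column also reads a column-major array sequentially, whereas the row-outer, column-inner loop in TestAccess jumps Rows elements on every step.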
Answer 5 (score: 1)
Change it to:
interface IMatrix
{
int Rows {get;set;}
int Columns {get;set;}
float At(int row, int column);
}
class ColumnMatrix : IMatrix
{
private float[,] data;
public int Rows {get;set;}
public int Columns {get;set;}
public float At(int row, int column)
{
return data[row,column];
}
}
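You are better off using an interface rather than an abstract class. If you need common functionality, add extension methods for the interface.
A 2D array may also turn out faster than either the jagged array or your flattened one, so it is worth measuring.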
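For the extension-method suggestion, a minimal sketch might look like this. IReadableMatrix, ReadableMatrixExtensions, Sum and ArrayMatrix are all hypothetical names chosen so they don't clash with the types above.

```csharp
// Hypothetical read-only matrix contract.
interface IReadableMatrix
{
    int Rows { get; }
    int Columns { get; }
    float At(int row, int column);
}

// Shared functionality lives here instead of in an abstract base class.
static class ReadableMatrixExtensions
{
    public static float Sum(this IReadableMatrix matrix)
    {
        float sum = 0f;
        for (int row = 0; row < matrix.Rows; row++)
            for (int column = 0; column < matrix.Columns; column++)
                sum += matrix.At(row, column);
        return sum;
    }
}

// Small implementation backed by a 2D array, as in the answer above.
sealed class ArrayMatrix : IReadableMatrix
{
    private readonly float[,] data;

    public ArrayMatrix(float[,] data) { this.data = data; }

    public int Rows { get { return data.GetLength(0); } }
    public int Columns { get { return data.GetLength(1); } }
    public float At(int row, int column) { return data[row, column]; }
}
```

With this in place, any implementation gets matrix.Sum() for free. Note that Sum still calls At through the interface, so this organizes the code but does not by itself remove the dispatch cost per element.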
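Answer 6 (score: 1)
You can use parallel programming to speed up your algorithm. You can compile this code and compare the performance of the regular matrix multiplication (the MultiplyMatricesSequential function) and the parallel matrix multiplication (the MultiplyMatricesParallel function). A function that compares the performance of the two is already implemented (in the Main function).
You can compile this code under Visual Studio 2010 (.NET 4.0):
namespace MultiplyMatrices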
{
using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
class Program
{
#region Sequential_Loop
static void MultiplyMatricesSequential(double[,] matA, double[,] matB,
double[,] result)
{
int matACols = matA.GetLength(1);
int matBCols = matB.GetLength(1);
int matARows = matA.GetLength(0);
for (int i = 0; i < matARows; i++)
{
for (int j = 0; j < matBCols; j++)
{
for (int k = 0; k < matACols; k++)
{
result[i, j] += matA[i, k] * matB[k, j];
}
}
}
}
#endregion
#region Parallel_Loop
static void MultiplyMatricesParallel(double[,] matA, double[,] matB, double[,] result)
{
int matACols = matA.GetLength(1);
int matBCols = matB.GetLength(1);
int matARows = matA.GetLength(0);
// A basic matrix multiplication.
// Parallelize the outer loop to partition the source array by rows.
Parallel.For(0, matARows, i =>
{
for (int j = 0; j < matBCols; j++)
{
// Use a temporary to improve parallel performance.
double temp = 0;
for (int k = 0; k < matACols; k++)
{
temp += matA[i, k] * matB[k, j];
}
result[i, j] = temp;
}
}); // Parallel.For
}
#endregion
#region Main
static void Main(string[] args)
{
// Set up matrices. Use small values to better view
// result matrix. Increase the counts to see greater
// speedup in the parallel loop vs. the sequential loop.
int colCount = 180;
int rowCount = 2000;
int colCount2 = 270;
double[,] m1 = InitializeMatrix(rowCount, colCount);
double[,] m2 = InitializeMatrix(colCount, colCount2);
double[,] result = new double[rowCount, colCount2];
// First do the sequential version.
Console.WriteLine("Executing sequential loop...");
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
MultiplyMatricesSequential(m1, m2, result);
stopwatch.Stop();
Console.WriteLine("Sequential loop time in milliseconds: {0}", stopwatch.ElapsedMilliseconds);
// For the skeptics.
OfferToPrint(rowCount, colCount2, result);
// Reset timer and results matrix.
stopwatch.Reset();
result = new double[rowCount, colCount2];
// Do the parallel loop.
Console.WriteLine("Executing parallel loop...");
stopwatch.Start();
MultiplyMatricesParallel(m1, m2, result);
stopwatch.Stop();
Console.WriteLine("Parallel loop time in milliseconds: {0}", stopwatch.ElapsedMilliseconds);
OfferToPrint(rowCount, colCount2, result);
// Keep the console window open in debug mode.
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
#endregion
#region Helper_Methods
static double[,] InitializeMatrix(int rows, int cols)
{
double[,] matrix = new double[rows, cols];
Random r = new Random();
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
matrix[i, j] = r.Next(100);
}
}
return matrix;
}
private static void OfferToPrint(int rowCount, int colCount, double[,] matrix)
{
Console.WriteLine("Computation complete. Print results? y/n");
char c = Console.ReadKey().KeyChar;
if (c == 'y' || c == 'Y')
{
Console.WindowWidth = 180;
Console.WriteLine();
for (int x = 0; x < rowCount; x++)
{
Console.WriteLine("ROW {0}: ", x);
for (int y = 0; y < colCount; y++)
{
Console.Write("{0:#.##} ", matrix[x, y]);
}
Console.WriteLine();
}
}
}
#endregion
}
}