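I implemented the addition of two generic matrices using both a multithreaded and a sequential algorithm. I tested my program with two large matrices (2000x2000) of real numbers (doubles), and the results were very good: the operation completed very quickly. Later, I implemented a class that represents complex numbers and repeated the same scenario with two such matrices, and I found that even for two 50x50 matrices the whole process takes a noticeable amount of time. What can I do to improve the execution time?
This is the method that creates the threads (first, I create two one-dimensional arrays so that it is easier to give each thread its start and end points):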
template<typename T, typename Func>
Matrix<T> *calculateLinearDistribution(Matrix<T> *matrix1,
Matrix<T> *matrix2,
Func operation,
int nThreads) {
const int n = matrix1->getN(), m = matrix2->getM(), totalNumbers = n * m;
Matrix<T> *result = new Matrix<T>(n, m);
T *matrix1Unidim = new T[totalNumbers];
T *matrix2Unidim = new T[totalNumbers];
convertMatrixToUnidimensionalArray(matrix1, matrix1Unidim);
convertMatrixToUnidimensionalArray(matrix2, matrix2Unidim); // flatten the second operand (the original flattened matrix1 twice)
if (totalNumbers < nThreads) {
nThreads = totalNumbers;
}
const int quantityPerThread = totalNumbers / nThreads;
int rest = totalNumbers % nThreads;
int start = 0, end = 0;
std::vector<std::thread> threads;
std::chrono::milliseconds startTime = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch());
for (int i = 0; i < nThreads; i++) {
end += quantityPerThread;
if (rest > 0) {
end++;
rest--;
}
threads.push_back(std::thread(MultithreadedMethods<T, Func>::linearElementsDistribution, &matrix1Unidim[0],
&matrix2Unidim[0], result, start, end, operation));
start = end;
}
for (int i = 0; i < nThreads; i++) {
threads[i].join();
}
std::chrono::milliseconds endTime = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch());
std::ofstream out(linearElemensStatisticsFile, std::ios_base::app);
std::chrono::milliseconds time = endTime - startTime;
out << "Dimensiune matrice: " << matrix1->getN() << "x" << matrix1->getM()
<< " | Nr. threads: " << nThreads << " | Timp de executie: " << time.count() << std::endl;
out.close();
delete[] matrix1Unidim;
delete[] matrix2Unidim;
return result;
}
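The start/end bookkeeping in the loop above can be isolated into a small helper, which makes it easier to check on its own. The following is only a sketch: splitRange is a hypothetical name that does not appear in the original code, and it assumes the same policy as the loop above, where the first (total % nThreads) chunks receive one extra element.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original code): splits the index range
// [0, total) into at most nThreads contiguous half-open chunks [start, end).
// The first (total % nThreads) chunks receive one extra element.
std::vector<std::pair<int, int>> splitRange(int total, int nThreads) {
    std::vector<std::pair<int, int>> chunks;
    if (total <= 0 || nThreads <= 0) {
        return chunks;  // nothing to distribute
    }
    if (total < nThreads) {
        nThreads = total;  // never create more chunks than elements
    }
    const int base = total / nThreads;
    int rest = total % nThreads;
    int start = 0;
    for (int i = 0; i < nThreads; i++) {
        int end = start + base;
        if (rest > 0) {
            end++;
            rest--;
        }
        chunks.emplace_back(start, end);
        start = end;
    }
    return chunks;
}
```

Each pair could then be passed to one thread as its start and end arguments.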
This is the function that is given to the threads:
template<typename T, typename Func>
void MultithreadedMethods<T, Func>::linearElementsDistribution(T *matrix1,
T *matrix2,
Matrix<T> *result,
int start,
int end,
Func operation) {
const int m = result->getM();
for (int i = start; i < end; i++) {
result->getElements()[i / m][i % m] = operation(matrix1[i], matrix2[i]);
}
}
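The expression [i / m][i % m] in the worker converts a flat index back into a row and a column of a row-major matrix with m columns. A minimal sketch of that mapping and its inverse is shown below; toRowCol and toFlat are hypothetical names introduced only for illustration.

```cpp
#include <utility>

// Hypothetical helper: maps a flat, row-major index to (row, column)
// for a matrix with `columns` columns.
inline std::pair<int, int> toRowCol(int flatIndex, int columns) {
    return {flatIndex / columns, flatIndex % columns};
}

// Hypothetical helper: the inverse mapping, (row, column) to flat index.
inline int toFlat(int row, int column, int columns) {
    return row * columns + column;
}
```

For a matrix with 3 columns, flat index 7 corresponds to row 2, column 1, and toFlat(2, 1, 3) gives 7 again.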
This is how I run the procedure with real numbers (very fast):
Matrix<double> *linearDistributionResult = calculateLinearDistribution(matrix1,
matrix2,
[](double a, double b) {
return a +
b;
}, nThreads);
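And finally, this is the bad part: when I try to use complex numbers, it takes a very long time compared with the sequential result, or even fails...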
Matrix<ComplexNumber> *linearDistributionResult = calculateLinearDistribution(matrix1,
matrix2,
[](ComplexNumber a,
ComplexNumber b) {
return ComplexNumber(
a.getRealComponent() +
b.getRealComponent(),
a.getImaginaryComponent() +
b.getImaginaryComponent());
}, nThreads);
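And of course this is the sequential implementation (I want to point out that it is also very slow when I use complex numbers, compared with real numbers):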
template<typename T, typename Func>
Matrix<T> *calculateSequentialResult(Matrix<T> *matrix1,
Matrix<T> *matrix2,
Func operation) {
const int n = matrix1->getN(), m = matrix1->getM();
Matrix<T> *result = new Matrix<T>(n, m);
std::chrono::milliseconds startTime = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch());
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
result->getElements()[i][j] = operation(matrix1->getElements()[i][j], matrix2->getElements()[i][j]);
}
}
std::chrono::milliseconds endTime = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch());
std::ofstream out(sequentialElementsStatistics, std::ios_base::app);
std::chrono::milliseconds time = endTime - startTime;
out << "Dimensiune matrice: " << matrix1->getN() << "x" << matrix1->getM()
<< " | Nr. threads: 1 | Timp de executie: " << time.count() << std::endl;
out.close();
return result;
}
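Both timing blocks above use std::chrono::system_clock with millisecond resolution. For an input as small as 50x50 that resolution is coarse, and system_clock can be adjusted while the program runs. A possible alternative is sketched below with std::chrono::steady_clock and microseconds; elapsedMicroseconds is a hypothetical helper, not something from the original code.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once and returns the elapsed time in
// microseconds, measured with the monotonic std::chrono::steady_clock.
template <typename Work>
long long elapsedMicroseconds(Work work) {
    const auto startTime = std::chrono::steady_clock::now();
    work();
    const auto endTime = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(endTime - startTime).count();
}
```

Wrapping only the addition loop in the callable keeps file parsing and other setup out of the measurement, which helps to see where the time is actually spent.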
The ComplexNumber class:
class ComplexNumber {
private:
double realComponent;
double imaginaryComponent;
public:
ComplexNumber() {}
ComplexNumber(const ComplexNumber &complexNumber);
double getRealComponent() const;
ComplexNumber(double realComponent, double imaginaryComponent);
void setRealComponent(double realComponent);
double getImaginaryComponent() const;
void setImaginaryComponent(double imaginaryComponent);
friend std::ostream &operator<<(std::ostream &os, const ComplexNumber &complexNumber);
};
And the definitions:
double ComplexNumber::getRealComponent() const {
return realComponent;
}
void ComplexNumber::setRealComponent(double realComponent) {
ComplexNumber::realComponent = realComponent;
}
double ComplexNumber::getImaginaryComponent() const {
return imaginaryComponent;
}
void ComplexNumber::setImaginaryComponent(double imaginaryComponent) {
ComplexNumber::imaginaryComponent = imaginaryComponent;
}
ComplexNumber::ComplexNumber(double realComponent, double imaginaryComponent) : realComponent(realComponent),
imaginaryComponent(imaginaryComponent) {
}
ComplexNumber::ComplexNumber(const ComplexNumber &complexNumber) {
this->imaginaryComponent = complexNumber.imaginaryComponent;
this->realComponent = complexNumber.realComponent;
}
std::ostream &operator<<(std::ostream &os, const ComplexNumber &complexNumber) {
if (complexNumber.imaginaryComponent == 0) {
os << std::to_string(complexNumber.realComponent);
} else if (complexNumber.realComponent == 0) {
os << std::to_string(complexNumber.imaginaryComponent) + "i";
} else {
// std::to_string already emits the minus sign for a negative imaginary part,
// so only a "+" separator has to be added for non-negative values.
os << std::to_string(complexNumber.realComponent)
<< (complexNumber.imaginaryComponent < 0 ? "" : "+")
<< std::to_string(complexNumber.imaginaryComponent) << "i";
}
return os;
}
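The problem was that I was parsing the complex numbers from the file with regular expressions, and those were very slow. After replacing them, I got the expected behavior.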
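The question does not show the file format or the replacement parser, so the following is only a sketch of one regex-free approach. It assumes each value is written as a real part immediately followed by a signed imaginary part and the letter i, for example 1.5-2i; parseComplex and ParsedComplex are hypothetical names.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type for the sketch below.
struct ParsedComplex {
    double real;
    double imaginary;
    bool ok;  // false when the text does not match the assumed format
};

// Hypothetical parser for the ASSUMED format "<real><sign><imaginary>i",
// e.g. "1.5-2i" or "3+4i". Spaces around the sign are not handled.
ParsedComplex parseComplex(const std::string &text) {
    const char *begin = text.c_str();
    char *afterReal = nullptr;
    const double real = std::strtod(begin, &afterReal);
    if (afterReal == begin) {
        return {0.0, 0.0, false};  // no leading number found
    }
    char *afterImaginary = nullptr;
    // strtod consumes the leading '+' or '-' of the imaginary part itself.
    const double imaginary = std::strtod(afterReal, &afterImaginary);
    if (afterImaginary == afterReal || *afterImaginary != 'i') {
        return {0.0, 0.0, false};  // missing imaginary part or trailing 'i'
    }
    return {real, imaginary, true};
}
```

std::strtod walks the string once, so it avoids the cost of constructing and running a regular expression for every matrix element.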
Answer 0 (score: 0)
Rewrite it as:
struct ComplexNumber {
double real; // *maybe* = 0
double imaginary; // *maybe* = 0
ComplexNumber( double r, double i ):real(r), imaginary(i) {}
ComplexNumber() = default;
ComplexNumber(const ComplexNumber &complexNumber) = default;
ComplexNumber& operator=(const ComplexNumber &complexNumber) = default;
};
std::ostream &operator<<(std::ostream &os, const ComplexNumber &complexNumber);
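operator<< can be slow, and it does not need to be a friend. Stop using accessors (especially non-inlinable accessors) to access your fields.
If you really need accessors, at least make them inline and put them in the header. But here, they are pointless.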
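If the accessors have to stay, a minimal sketch of the header-only variant might look like the following. The file name in the comment is an assumption for illustration; the member and accessor names follow the class in the question. Member functions defined inside the class body are implicitly inline, so the compiler can see their bodies at every call site.

```cpp
// complex_number.h (hypothetical file name)
class ComplexNumber {
public:
    ComplexNumber() = default;
    ComplexNumber(double realComponent, double imaginaryComponent)
        : realComponent(realComponent), imaginaryComponent(imaginaryComponent) {}

    // Defined in the class body, so implicitly inline.
    double getRealComponent() const { return realComponent; }
    double getImaginaryComponent() const { return imaginaryComponent; }

private:
    double realComponent = 0.0;
    double imaginaryComponent = 0.0;
};
```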
Even if I did not need things like operator+, I would write them anyway, because why not?
struct ComplexNumber {
double real; // *maybe* = 0
double imaginary; // *maybe* = 0
ComplexNumber( double r, double i ):real(r), imaginary(i) {}
ComplexNumber() = default;
ComplexNumber(const ComplexNumber &complexNumber) = default;
ComplexNumber& operator=(const ComplexNumber &complexNumber) = default;
ComplexNumber& operator+=( ComplexNumber const& o )& {
real += o.real;
imaginary += o.imaginary;
return *this;
}
ComplexNumber& operator-=( ComplexNumber const& o )& {
real -= o.real;
imaginary -= o.imaginary;
return *this;
}
ComplexNumber& operator*=( ComplexNumber const& o )& {
ComplexNumber r{ real*o.real - imaginary*o.imaginary, real*o.imaginary + imaginary*o.real };
*this = r;
return *this;
}
friend ComplexNumber operator+( ComplexNumber lhs, ComplexNumber const& rhs ) {
lhs += rhs;
return lhs;
}
friend ComplexNumber operator-( ComplexNumber lhs, ComplexNumber const& rhs ) {
lhs -= rhs;
return lhs;
}
friend ComplexNumber operator*( ComplexNumber lhs, ComplexNumber const& rhs ) {
lhs *= rhs;
return lhs;
}
};
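This is brain-dead boilerplate, but without at least this much I cannot justify having a ComplexNumber type. (I left out operator/, because there are still important decisions to make about how to handle division by zero.)
In any case, once we stop hiding how the data is accessed from the code that does the work, the optimizer has a chance to actually optimize.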
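As an illustration of how such a type plugs into generic code, here is a small sketch that adds two flat arrays element by element with std::transform and std::plus. The struct is a trimmed copy of the one above with only operator+ kept, and addAll is a hypothetical helper; it assumes that b has at least as many elements as a.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Trimmed copy of the struct above, reduced to what this sketch needs.
struct ComplexNumber {
    double real;
    double imaginary;
    ComplexNumber(double r, double i) : real(r), imaginary(i) {}
    ComplexNumber() = default;
};

inline ComplexNumber operator+(ComplexNumber lhs, const ComplexNumber &rhs) {
    lhs.real += rhs.real;
    lhs.imaginary += rhs.imaginary;
    return lhs;
}

// Hypothetical helper: element-wise sum of two flat arrays.
// ASSUMES b.size() >= a.size().
std::vector<ComplexNumber> addAll(const std::vector<ComplexNumber> &a,
                                  const std::vector<ComplexNumber> &b) {
    std::vector<ComplexNumber> result(a.size());
    std::transform(a.begin(), a.end(), b.begin(), result.begin(),
                   std::plus<ComplexNumber>());
    return result;
}
```

The same operator would also let the complex case use the same a + b lambda that the question passes for double.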