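I am trying to modify a program that takes two files as input (each representing a vector) and computes the dot product between them. It is supposed to run in parallel, but I have been told that the number of points in each file may not be evenly divisible by the number of available processors, and that each process may read from the wrong position in the file. What I mean is that with four processors, the first 250 points might be read and computed correctly, but the second processor might read the same 250 points and give an incorrect result. This is what I have so far. I have marked every modification I made with a comment.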
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <mpi.h>
int main(int argc, char *argv[]){
    MPI_Init(&argc, &argv);
    //parse command line arguments
    if( argc != 3 ){
        std::cout << "*** syntax: " << argv[0] << " vecFile1.txt vecFile2.txt" << std::endl;
        MPI_Finalize();
        return(0);
    }
    //get input file names
    std::string vecFile1(argv[1]);
    std::string vecFile2(argv[2]);
    //open file streams
    std::ifstream vecStream1(vecFile1.c_str());
    std::ifstream vecStream2(vecFile2.c_str());
    //check that streams opened properly
    if(!vecStream1.is_open() || !vecStream2.is_open()){
        std::cout << "*** Could not open Files ***" << std::endl;
        MPI_Finalize();
        return(0);
    }
    //if files are open read their lengths and make sure they are compatible
    long vecLength1 = 0; vecStream1 >> vecLength1;
    long vecLength2 = 0; vecStream2 >> vecLength2;
    if( vecLength1 != vecLength2){
        std::cout << "*** Vectors are not the same length ***" << std::endl;
        MPI_Finalize();
        return(0);
    }
    int numProc; //New variable for managing number of processors
    MPI_Comm_size(MPI_COMM_WORLD, &numProc); //Added line to obtain number of processors
    long subDomainSize = (vecLength1+numProc-1)/numProc; //Not sure if this is correct calculation; meant to account for divisibility with remainders
    //read in the vector components and perform dot product
    double dotSum = 0.;
    for(long i = 0; i < subDomainSize; i++){ //Original parameter used was vecLength1; subDomainSize used instead for each process
        double ind1 = 0.; vecStream1 >> ind1;
        double ind2 = 0.; vecStream2 >> ind2;
        dotSum += ind1*ind2;
    }
    std::cout << "VECTOR DOT PRODUCT: " << dotSum << std::endl;
    MPI_Finalize();
    return 0;
}
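Beyond these changes, I don't know where to go from here. What do I need to do to correctly compute the dot product of two vectors in parallel, using two text files as input? Each file contains 100000 points, so modifying the files by hand is impractical.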
Answer 0 (score: 1)
I won't write the code here, since this looks like an assignment problem, but I'll try to give you some hints to point you in the right direction.
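The processor with rank r processes the vector elements from r*subdomainsize to (r+1)*subdomainsize - 1.
Use vectorlength/numProc as subdomainsize. Every processor handles subdomainsize elements, except the last processor (rank == numProc - 1, since ranks start at 0), which handles all the remaining elements up to the end of the vector.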
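For anyone who wants to check the index arithmetic in these hints, here is a minimal sketch in plain C++ with no MPI calls. The names `Range`, `ownedRange`, and `partialDot` are hypothetical helpers invented for illustration, and the input is passed as strings so the sketch can be tried without files; it assumes the format from the question, where the first token is the vector length. In a real MPI program, `rank` and `numProc` would come from `MPI_Comm_rank` and `MPI_Comm_size`, and the per-rank partial sums would typically be combined with something like `MPI_Reduce` using `MPI_SUM`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Half-open index range [begin, end) owned by one rank.
struct Range {
    long begin;
    long end;
};

// Hypothetical helper: floor-divide the work among ranks and let the
// last rank (numProc - 1) absorb whatever remainder is left over.
Range ownedRange(long vecLength, int numProc, int rank) {
    assert(numProc > 0 && rank >= 0 && rank < numProc);
    const long subDomainSize = vecLength / numProc;
    const long begin = rank * subDomainSize;
    const long end = (rank == numProc - 1) ? vecLength : begin + subDomainSize;
    return Range{begin, end};
}

// Hypothetical helper: partial dot product for one rank. Each text holds
// the vector length first, followed by the components.
double partialDot(const std::string& text1, const std::string& text2,
                  int numProc, int rank) {
    std::istringstream s1(text1);
    std::istringstream s2(text2);
    long n1 = 0, n2 = 0;
    s1 >> n1;
    s2 >> n2;
    assert(n1 == n2);

    const Range r = ownedRange(n1, numProc, rank);
    double a = 0., b = 0.;

    // Skip the components that belong to lower ranks, so this rank
    // does not re-read the first chunk of the file.
    for (long i = 0; i < r.begin; ++i) {
        s1 >> a;
        s2 >> b;
    }

    // Accumulate only this rank's own slice.
    double sum = 0.;
    for (long i = r.begin; i < r.end; ++i) {
        s1 >> a;
        s2 >> b;
        sum += a * b;
    }
    return sum;
}
```

With a vector length of 10 and 4 processes, `ownedRange` gives ranks 0 to 3 the ranges [0, 2), [2, 4), [4, 6), and [6, 10), so the last rank picks up the two leftover elements. Skipping by reading and discarding is simple but makes higher ranks parse most of the file; that is a trade-off of this sketch, not a recommendation.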