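I have a matrix of numbers that I want to distribute among MPI processes (I want to split the matrix into chunks so that each process owns its own part). The code below is commented so you can follow what I am doing: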
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MASTER 0

int *data;        // pointer to data
int total_elems;  // total number of elems
int type_size;    // size of the data type (in our case it will be int)

int main (int argc, char *argv[]){
    MPI_Status status;
    int my_rank,my_size;
    int rc = -1;
    int chunk;
    int i;

    type_size = sizeof(int);
    if (argc != 2){
        printf("usage: %s file_name\n",argv[0]);
        exit(1);
    }
    printf("Using %s as input\n",argv[1]);
    total_elems = file_size(argv[1],sizeof(int)); // this function calculates the number of elems in the file/matrix
    if (total_elems<0){
        printf("Invalid number of elements\n");
        MPI_Abort(MPI_COMM_WORLD, rc);
    }
    printf("There are %d elems in the matrix\n",total_elems);

    MPI_Init(&argc,&argv); // initialize MPI environment
    MPI_Comm_rank(MPI_COMM_WORLD,&my_rank);
    MPI_Comm_size(MPI_COMM_WORLD,&my_size);
    printf("Number of MPI processes: %d\n", my_size);

    chunk = total_elems/my_size; // elems in a chunk, a chunk for every process
    printf("Chunk size: %d\n", chunk);
    printf("up to here ok 0\n");

    if (my_rank == MASTER)
    {
        // nothing
        printf("up to here ok 0.5\n");
        // here it is still ok
    }else{ // NOW HERE ATTENTION!! here it is still ok
        printf("ok?\n"); // does not print this printf, i have a segmentation fault
        data = (int *)malloc(chunk*sizeof(int)); // we assign the number of bytes to the chunk
        if (data == NULL){
            printf("Error in malloc\n");
            MPI_Abort(MPI_COMM_WORLD, rc);
        }
        rc = read_from_pos(argv[1], chunk*(my_rank-1), chunk, type_size, (void *)data); // a function that reads from the file from a position a specified number of elements and stores in the buffer data
        if (rc<0){
            printf("Error reading file\n");
            MPI_Abort(MPI_COMM_WORLD, rc);
        }
    }
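The listing above calls file_size, but its body is not shown. Purely as an illustration, here is a minimal sketch of how such a helper could count the elements of a binary file with stat. The name file_size_sketch, its signature, and the choice to reject files that are not a whole number of elements are all assumptions, not the code actually used in the question.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the question's file_size(): returns the number
 * of elements of type_size bytes stored in the file, or -1 on error. */
int file_size_sketch(const char *name, size_t type_size)
{
    assert(name != NULL && type_size > 0);

    struct stat st;
    if (stat(name, &st) != 0) {
        perror("stat");            /* file missing or not accessible */
        return -1;
    }

    off_t elems = st.st_size / (off_t)type_size;
    /* Reject a trailing partial element and counts that do not fit in int. */
    if (st.st_size % (off_t)type_size != 0 || elems > INT_MAX)
        return -1;
    return (int)elems;
}
```

Note that this counts raw bytes, so it only matches the number of matrix entries if the file stores binary ints rather than text.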
The 'read_from_pos' function:
int read_from_pos(char *name, uint pos,uint num_elems,uint type_size,void *buff)
{
    int fd,ret=0,pending,ready=0;

    fd=open(name,O_RDONLY);
    if (fd<0) return -1;
    ret=lseek(fd,(pos*type_size),SEEK_SET); // we go to the position
    if (ret!=pos*type_size) return -1;
    pending=num_elems*type_size; // pending items to be read
    ready=0; // read bytes 0 at the moment
    while(pending>0){
        ret=read(fd,(char *)buff+ready,pending);
        if (ret<0) return -1;
        pending=pending-ret;
        ready=ready+ret;
    }
    printf("Total number of elements read %u\n",ready/type_size);
    close(fd);
    return 0;
}
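One detail of the loop above is worth noting: read returns 0 at end of file, in which case pending never shrinks and the loop does not terminate, and the early returns leave the descriptor open. The following is a sketch of a more defensive variant, not a drop-in replacement. The name read_from_pos_checked and the decision to return the number of elements actually read are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical defensive variant of read_from_pos.
 * Returns the number of whole elements read, or -1 on error. */
long read_from_pos_checked(const char *name, size_t pos, size_t num_elems,
                           size_t type_size, void *buff)
{
    assert(name != NULL && buff != NULL && type_size > 0);

    int fd = open(name, O_RDONLY);
    if (fd < 0) return -1;

    off_t offset = (off_t)(pos * type_size);
    if (lseek(fd, offset, SEEK_SET) != offset) {
        close(fd);                 /* do not leak the descriptor on error */
        return -1;
    }

    size_t pending = num_elems * type_size;  /* bytes still to read */
    size_t ready = 0;                        /* bytes read so far */
    while (pending > 0) {
        ssize_t ret = read(fd, (char *)buff + ready, pending);
        if (ret < 0) {
            close(fd);
            return -1;
        }
        if (ret == 0) break;       /* end of file: stop instead of spinning */
        pending -= (size_t)ret;
        ready += (size_t)ret;
    }

    close(fd);
    return (long)(ready / type_size);
}
```

A caller can then compare the return value with the chunk size it asked for and treat a short read as an error if that is appropriate.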
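So I don't understand why everything is fine inside the if branch, but when execution goes to the else branch, i.e. when the process is not the master, it does not even print the printf and I get a segmentation fault. I am sure the printf itself does not cause the segmentation fault, and I used "\n", so my guess is the malloc, yet it still looks correct to me: each process allocates a buffer the size of its chunk.
So maybe it is because of MPI. Do I have to protect this piece of code somehow? I want to give each slave process its own independent chunk of the matrix. Am I doing it wrong?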
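On the chunking itself: the code computes chunk = total_elems/my_size, but only ranks 1 to my_size-1 read data, each at offset chunk*(my_rank-1). With 10 elements and 4 processes, for example, chunk is 2 and the three workers read 6 elements, so the last 4 are never read. In the posted run the program reports 1 MPI process, so no rank takes the else branch at all in that run. Below is a sketch of one way to split the elements across the workers only, spreading any remainder. The helper name worker_slice and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes the slice owned by worker `rank`
 * (1 .. nprocs-1), assuming rank 0 (the master) holds no data. */
void worker_slice(int total_elems, int nprocs, int rank, int *offset, int *count)
{
    assert(nprocs >= 2 && rank >= 1 && rank < nprocs);
    assert(total_elems >= 0 && offset != NULL && count != NULL);

    int workers = nprocs - 1;           /* rank 0 reads nothing */
    int w = rank - 1;                   /* zero-based worker index */
    int base = total_elems / workers;   /* minimum elements per worker */
    int extra = total_elems % workers;  /* first `extra` workers get one more */

    *count = base + (w < extra ? 1 : 0);
    *offset = w * base + (w < extra ? w : extra);
}
```

Each worker would then allocate count elements and pass offset and count to the read function. The asserts make the assumption of at least two processes explicit, so a single-process run fails loudly instead of silently skipping the work.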
For a matrix with 10 numbers I get this output:
./prog easy.txt
Using easy.txt as input
There are 10 elements in the file
There are 10 elems in the matrix
Number of MPI processes: 1
Chunk size: 10
up to here ok 0
up to here ok 0.5
Segmentation fault