I am a beginner with MPI, and this code seems to produce a segmentation fault.
int luDecomposeP(double *LU, int n)
{
    int i, j, k;
    int sendcount, recvcount, remaining, rank, numProcs, status;
    double *row, *rowFinal, *start, factor;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    row = (double *)malloc(n*sizeof(double));
    rowFinal = (double *)malloc(n*n*sizeof(double));
    for(i=0; i<n-1; i++)
    {
        /* Rank 0 pivots and copies out the pivot row. */
        if(rank == 0)
        {
            status = pivot(LU,i,n);
            for(j=0; j<n; j++)
                row[j] = LU[n*i+j];
        }
        MPI_Bcast(&status, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if(status == -1)
            return -1;
        MPI_Bcast(row, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        /* Split the rows below the pivot evenly; the leftover rows stay on rank 0. */
        sendcount = (n-i-1)/numProcs;
        recvcount = (n-i-1)/numProcs;
        remaining = (n-i-1)%numProcs;
        if(rank == 0)
            start = LU + n*(i+1);
        else
            start = NULL;
        MPI_Scatter(start, sendcount*n, MPI_DOUBLE, rowFinal, recvcount*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        /* Eliminate column i in the rows this rank received. */
        for(j=0; j<recvcount; j++)
        {
            factor = rowFinal[n*j+i]/row[i];
            for(k=i+1; k<n; k++)
                rowFinal[n*j+k] -= row[k]*factor;
            rowFinal[n*j+i] = factor;
        }
        MPI_Gather(rowFinal, recvcount*n, MPI_DOUBLE, start, sendcount*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        /* Rank 0 handles the rows that did not divide evenly. */
        if(rank == 0)
        {
            int ctr = 0;
            while(ctr<remaining)
            {
                int index = sendcount*numProcs + ctr + i + 1;
                factor = LU[n*index+i]/row[i];
                for(k=i+1; k<n; k++)
                    LU[n*index+k] -= row[k]*factor;
                LU[n*index+i] = factor;
                ctr++;
            }
        }
    }
    free(row);
    free(rowFinal);
    return 0;
}
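This code causes a segmentation fault. I have read many answers and tried to fix it, but without success. I read about the problem of dereferencing a NULL pointer and addressed it by using a pointer named start, but the segmentation fault still occurs.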
Error:
[sheshnag:32334] *** Process received signal ***
[sheshnag:32334] Signal: Segmentation fault (11)
[sheshnag:32334] Signal code: Address not mapped (1)
[sheshnag:32334] Failing at address: 0x44000098
[sheshnag:32334] [0] /lib/libpthread.so.0(+0xf8f0)[0x2b082eafe8f0]
[sheshnag:32334] [1] /usr/lib/openmpi/lib/libmpi.so.0(MPI_Comm_rank+0x5e)[0x2b082d5ff6ee]
[sheshnag:32334] [2] ./libluDecompose.so(luDecomposeP+0x2f)[0x2b082d17ea2f]
[sheshnag:32334] [3] _tmp/bench.mpi.exe(main+0x2e7)[0x40b61d]
[sheshnag:32334] [4] /lib/libc.so.6(__libc_start_main+0xfd)[0x2b082ed2ac4d]
[sheshnag:32334] [5] _tmp/bench.mpi.exe()[0x40ac49]
Answer 0: (score: 1)
From the stack trace you reported, the segmentation fault appears to occur inside the call to MPI_Comm_rank().
I see two possible problems:
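MPI_Init() is missing. MPI usually reports this explicitly, but perhaps your MPI implementation crashes instead? MPI_Init() must be called before any other MPI call, apart from a few exceptions such as MPI_Initialized() (and MPI_Finalize() must be called before exiting).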
A broken MPI installation. Does a simple MPI "hello world" program work correctly?
Oh yes... a third option:
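The stack was already corrupted before luDecomposeP() was entered (by the instructions preceding the call to luDecomposeP()): MPI_Comm_rank() is the first operation that writes to a stack variable.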