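I am writing a program that multiplies two matrices, A and B, which are stored in text files. Their sizes can vary, so my program has to detect the dimensions of A and B, determine whether they can be multiplied, and so on.
The real problem appears when I pass the data from the master process to the slave processes. In my program the master sends rows to the slaves, and the number of rows each slave gets depends on the number of rows in the matrix and the number of processes.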
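To make the row distribution concrete, here is a minimal sketch of one common way to split the rows. It is an assumption, not the code from the question: the helper names rows_for_worker and offset_for_worker are hypothetical, and the worker index w is taken to be 0-based (for example id - 1 when rank 0 is the master). The variable names averageRows and extraRows in the slave function hint at a split of this kind.

```c
#include <assert.h>

/* Hypothetical helper: number of rows of A that worker `w` (0-based)
 * receives when `total_rows` rows are split among `workers` workers.
 * The first (total_rows % workers) workers get one extra row each. */
int rows_for_worker(int total_rows, int workers, int w)
{
    assert(total_rows >= 0 && workers > 0 && w >= 0 && w < workers);
    int average_rows = total_rows / workers;
    int extra_rows = total_rows % workers;
    return average_rows + (w < extra_rows ? 1 : 0);
}

/* Hypothetical helper: index of the first row assigned to worker `w`. */
int offset_for_worker(int total_rows, int workers, int w)
{
    assert(total_rows >= 0 && workers > 0 && w >= 0 && w < workers);
    int average_rows = total_rows / workers;
    int extra_rows = total_rows % workers;
    /* Every earlier worker contributes average_rows rows, plus one
     * extra row for each earlier worker that received an extra. */
    return w * average_rows + (w < extra_rows ? w : extra_rows);
}
```

With 10 rows and 3 workers this gives blocks of 4, 3 and 3 rows starting at rows 0, 4 and 7, so the blocks cover the matrix without gaps or overlap.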
Matrix A is stored by rows, but matrix B is stored by columns.
matrixA[0] ----------------
matrixA[1] ----------------
matrixA[2] ----------------

matrixB[0]   matrixB[1]   matrixB[2]   ...
    |            |            |
    |            |            |
    |            |            |
    |            |            |
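The point of this layout is that each entry of C is the dot product of one stored row of A and one stored column of B. A minimal sketch of that product follows. It assumes flat, contiguous buffers rather than the double ** matrices used in the question, and the function name multiply_rows_by_columns is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Computes C = A * B, where
 *   a      holds A row by row      (rows  x inner, row i at a + i * inner),
 *   b_cols holds B column by column (inner x cols, column j at b_cols + j * inner),
 *   c      receives C row by row    (rows  x cols).
 * All three buffers are assumed to be flat and contiguous. */
void multiply_rows_by_columns(const double *a, const double *b_cols, double *c,
                              size_t rows, size_t inner, size_t cols)
{
    assert(a != NULL && b_cols != NULL && c != NULL);
    for (size_t i = 0; i < rows; i++) {
        const double *row = a + i * inner;
        for (size_t j = 0; j < cols; j++) {
            const double *col = b_cols + j * inner;
            double sum = 0.0;
            /* Both operands are walked sequentially in memory. */
            for (size_t k = 0; k < inner; k++)
                sum += row[k] * col[k];
            c[i * cols + j] = sum;
        }
    }
}
```

Storing B by columns means the inner loop reads both operands with stride 1, which is usually the motivation for this layout.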
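You can find the text files (the input) here: matrixA matrixB.
After a few days of 80s-style debugging (that is, with no debugger at all), I think the problem (the output I get is a segmentation fault) is in these lines of code (from the slave function):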
void slave( int id, int slaves, double **matrixA, double **matrixB, double **matrixC )
{
    int type, columnsA, columnsB, rowsA, rowsB, Btype, offset, rows, averageRows, extraRows;
    MPI_Status status;
    /* Recieves columns of A and B from master. */
    type = 3;
    MPI_Recv( &columnsA, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    MPI_Recv( &rowsA, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    MPI_Recv( &columnsB, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    MPI_Recv( &rowsB, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    printf( "%d slave recieved ColumnA = %d, RowsA = %d, ColumnB = %d, RowsB = %d.\n", id, columnsA, rowsA, columnsB, rowsB );
    /* Recieve from master. */
    type = 0;
    MPI_Recv( &offset, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    MPI_Recv( &rows, 1, MPI_INT, 0, type, MPI_COMM_WORLD, &status );
    matrixAllocate( &matrixA, columnsA, rows );
    matrixAllocate( &matrixB, rowsB, columnsB );
    matrixAllocate( &matrixC, columnsB, rows );
    printf( "Correctly allocated.\n" );
    /* This part is only to see if the mem was correctly allocated.*/
    for( int i = 0; i < rows; i++ ){
        for( int j = 0; j < columnsA; j++)
            matrixA[ i ][ j ] = i + j;
    }
    for( int i = 0; i < columnsB; i++ ){
        for( int j = 0; j < rowsB; j++)
            matrixB[ i ][ j ] = i * j;
    }
    if ( id == 1 ){
        matrixPrinter( "matrixA", matrixA, rows, columnsA );
        matrixBPrinter( "matrixB", matrixB, rowsB, columnsB );
        matrixPrinter( "matrixC", matrixC, rows, columnsB );
    }
    MPI_Recv( &matrixA, ( rows * columnsA ) , MPI_DOUBLE, 0, type, MPI_COMM_WORLD, &status );
    MPI_Recv( &matrixB, ( rowsB * columnsB ), MPI_DOUBLE, 0, type, MPI_COMM_WORLD, &status );
    printf( "Correctly recieved.\n" );
    matrixPrinter( "matrixA", matrixA, rows, columnsA );
    matrixBPrinter( "matrixB", matrixB, rowsB, columnsB );
    matrixPrinter( "matrixC", matrixC, rows, columnsB );
    if ( id == 1 ){
        printf( "My id is %d.\n", id );
        for ( int i = 0; i < rows; i++ ){
            for( int j = 0; j < columnsA; j++ ){
                printf( "%lf ", matrixA[ i ][ j ] );
            }
            printf( "\n" );
        }
    }
The whole code can be found here: MPI matrix multiplier in C.
The terminal output is:
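Answer 0 (score: 6)
The problem is that the matrices have type "double **", as allocated in "matrixAllocate". When sending and receiving data, MPI assumes that buf holds the data contiguously, as a 1-D array, but that is not the case here. (You can check this easily by printing the address of each matrix entry.)
I think this is a well-known pitfall in C: pointers and arrays are different things. If the matrix were a true two-dimensional array, all of its entries would be laid out contiguously.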
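A small sketch may make the difference visible. It assumes that matrixAllocate allocates every row with its own malloc call, which is the usual pattern for a "double **" matrix but is not confirmed by the excerpt; the function names and the 3 x 4 size are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 3, COLS = 4 };

/* A true 2-D array: the language guarantees that row i starts exactly
 * COLS doubles after row i - 1, so this always returns 1. */
int array2d_is_contiguous(void)
{
    double m[ROWS][COLS];
    for (int i = 1; i < ROWS; i++)
        if (&m[i][0] != &m[i - 1][0] + COLS)
            return 0;
    return 1;
}

/* Rows allocated one malloc at a time: nothing guarantees that they are
 * adjacent, and with typical allocators they are not. Returns 1 only if
 * the rows happen to sit back to back. */
int separate_rows_are_contiguous(void)
{
    double *rows[ROWS];
    int contiguous = 1;
    for (int i = 0; i < ROWS; i++) {
        rows[i] = malloc(COLS * sizeof *rows[i]);
        assert(rows[i] != NULL);
    }
    for (int i = 1; i < ROWS; i++)
        if (rows[i] != rows[i - 1] + COLS)
            contiguous = 0;
    for (int i = 0; i < ROWS; i++)
        free(rows[i]);
    return contiguous;
}
```

A buffer handed to MPI with a count of ROWS * COLS doubles must look like the first case. In the second case MPI would read or write past the end of the first row.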
My suggestion is to allocate each matrix as a 1-D array and not to use multidimensional subscripts.
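As a minimal sketch of that suggestion, assuming row-major storage: the helper names matrix_alloc_flat and matrix_index are hypothetical and are not part of the code in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a zero-initialised rows x cols matrix as
 * one contiguous block. Returns NULL on overflow or allocation failure.
 * The caller releases the block with free(). */
double *matrix_alloc_flat(size_t rows, size_t cols)
{
    if (cols != 0 && rows > SIZE_MAX / cols)
        return NULL;                    /* rows * cols would overflow */
    return calloc(rows * cols, sizeof(double));
}

/* Hypothetical helper: position of entry (i, j) in a row-major block. */
size_t matrix_index(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

/* With such a block, the receive from the question could be written as
 *
 *     MPI_Recv( a, rows * columnsA, MPI_DOUBLE, 0, type,
 *               MPI_COMM_WORLD, &status );
 *
 * where a is the double * returned by matrix_alloc_flat. Note that the
 * buffer itself is passed, whereas the code in the question passes
 * &matrixA, the address of the pointer variable. */
```

Entry (i, j) is then read and written as a[matrix_index(i, j, cols)] instead of matrixA[i][j].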
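Answer 1 (score: 1)
I hate to post an answer like this without having read all of the MPI code carefully, but I suggest that in future you use the compiler option -Wall. It can help, and it catches errors like this one. For MPI and anything computation-related, you almost always want the -Wall compiler option.
Look at the output and the list of warnings for your code.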
$ mpic++ test.cpp -Wall -o test
test.cpp:30:63: warning: unused variable 'rank' [-Wunused-variable]
int lineA, lineB, columnA, columnB, id, size, rc, slaves, rank, source;
^
test.cpp:30:69: warning: unused variable 'source' [-Wunused-variable]
int lineA, lineB, columnA, columnB, id, size, rc, slaves, rank, source;
^
test.cpp:126:50: warning: variable 'matrixC' is uninitialized when used here [-Wuninitialized]
slave( id, slaves, matrixA, matrixB, matrixC );
^~~~~~~
test.cpp:34:21: note: initialize the variable 'matrixC' to silence this warning
**matrixC;
^
= NULL
test.cpp:126:41: warning: variable 'matrixB' is uninitialized when used here [-Wuninitialized]
slave( id, slaves, matrixA, matrixB, matrixC );
^~~~~~~
test.cpp:33:21: note: initialize the variable 'matrixB' to silence this warning
**matrixB,
^
= NULL
test.cpp:85:44: warning: variable 'rc' is uninitialized when used here [-Wuninitialized]
MPI_Abort( MPI_COMM_WORLD, rc );
^~
test.cpp:30:53: note: initialize the variable 'rc' to silence this warning
int lineA, lineB, columnA, columnB, id, size, rc, slaves, rank, source;
^
= 0
test.cpp:126:32: warning: variable 'matrixA' is uninitialized when used here [-Wuninitialized]
slave( id, slaves, matrixA, matrixB, matrixC );
^~~~~~~
test.cpp:32:21: note: initialize the variable 'matrixA' to silence this warning
double **matrixA,
^
= NULL
test.cpp:398:20: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixPrinter( "matrixA", matrixA, rows, columnsA );
^
test.cpp:399:21: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixBPrinter( "matrixB", matrixB, rowsB, columnsB );
^
test.cpp:400:20: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixPrinter( "matrixC", matrixC, rows, columnsB );
^
test.cpp:407:20: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixPrinter( "matrixA", matrixA, rows, columnsA );
^
test.cpp:408:21: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixBPrinter( "matrixB", matrixB, rowsB, columnsB );
^
test.cpp:409:20: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
matrixPrinter( "matrixC", matrixC, rows, columnsB );
^
test.cpp:363:70: warning: unused variable 'averageRows' [-Wunused-variable]
int type, columnsA, columnsB, rowsA, rowsB, Btype, offset, rows, averageRows, extraRows;
^
test.cpp:363:83: warning: unused variable 'extraRows' [-Wunused-variable]
int type, columnsA, columnsB, rowsA, rowsB, Btype, offset, rows, averageRows, extraRows;
^
test.cpp:363:49: warning: unused variable 'Btype' [-Wunused-variable]
int type, columnsA, columnsB, rowsA, rowsB, Btype, offset, rows, averageRows, extraRows;
^
15 warnings generated.