MPI_Gather from an indexed type to raw data

Time: 2015-05-25 09:04:51

Tags: c parallel-processing mpi

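I am having trouble using MPI_Gather to collect integers described by an indexed datatype into a plain vector of integers. When I try to gather them without creating a new receive type, I get an MPI_ERR_TRUNCATE error.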

*** An error occurred in MPI_Gather
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

A minimal example that reproduces the problem:

#include <stdlib.h>
#include "mpi.h"

int i, comm_rank, comm_size, err;
int *send_data, *recv_data;
int *blocklengths, *displacements;
MPI_Datatype send_type;

int main ( int argc, char *argv[] ){
  MPI_Init ( &argc, &argv );
  MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &comm_size);


  unsigned int block = 1000;
  unsigned int count = 1000;

  send_data = malloc(sizeof(int)*block*count);
  for (i=0; i<block*count; ++i) send_data[i] = i;

  recv_data = 0;
  if(comm_rank==0) recv_data = malloc(sizeof(int)*block*count*comm_size);

  blocklengths = malloc(sizeof(int)*count);
  displacements = malloc(sizeof(int)*count);
  for (i=0; i<count; ++i) {
    blocklengths[i] = block;
    displacements[i] = i*block;
  }
  MPI_Type_indexed(count, blocklengths, displacements, MPI_INT, &send_type);
  MPI_Type_commit(&send_type);

  err = MPI_Gather((void*)send_data, 1, send_type, (void*)recv_data, block*count, MPI_INT, 0, MPI_COMM_WORLD);
  if (err) MPI_Abort(MPI_COMM_WORLD, err);

  MPI_Type_free(&send_type);
  free(send_data);
  free(recv_data);
  free(blocklengths);
  free(displacements);


  MPI_Finalize ( );
  return 0;
}

I noticed that the error does not occur when the amount of data transferred is smaller than 6K bytes.
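As a sanity check, a plain count mismatch can be ruled out with arithmetic alone. MPI_Gather requires the send signature on each rank (sendcount times sendtype) to match the receive signature at the root (recvcount times recvtype). The sketch below uses no MPI calls; `indexed_element_count` and `payload_bytes` are hypothetical helper names introduced only for this illustration. It sums the block lengths the same way the example fills them, so the result can be compared with the `block*count` passed as the receive count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI function): number of base elements
 * described by an indexed layout, i.e. the sum of its block lengths. */
size_t indexed_element_count(const int *blocklengths, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        assert(blocklengths[i] >= 0);   /* block lengths must be non-negative */
        total += (size_t)blocklengths[i];
    }
    return total;
}

/* Hypothetical helper: bytes contributed by one rank for a given element size. */
size_t payload_bytes(size_t elements, size_t elem_size) {
    return elements * elem_size;
}
```

With block = 1000 and count = 1000 the send type describes 1,000,000 ints, which equals the receive count of block*count, so the two signatures agree on the number of elements. Assuming a 4-byte int, each rank contributes 4,000,000 bytes, far above the 6K bytes below which the error does not appear.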

I found a workaround using MPI_Type_contiguous for the receive type, although it seems to add extra overhead to my code.

MPI_Type_contiguous(block*count, MPI_INT, &recv_type);
MPI_Type_commit(&recv_type);
err = MPI_Gather((void*)send_data, 1, send_type, (void*)recv_data, 1, recv_type, 0, MPI_COMM_WORLD);
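For reference, the indexed send layout in the example has no gaps: block i starts at displacement i*block and is block elements long, so it covers the same range of ints as a contiguous type of block*count elements. The sketch below checks that property in plain C, without MPI. `indexed_layout_is_contiguous` is a hypothetical name used only for this illustration, and it assumes displacements are counted in units of the base type, as in MPI_Type_indexed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI function): returns true when the first
 * block starts at displacement 0 and every following block starts exactly
 * where the previous one ends, i.e. the layout is gap-free and in order. */
bool indexed_layout_is_contiguous(const int *blocklengths,
                                  const int *displacements,
                                  size_t count) {
    int expected = 0;                    /* where the next block should start */
    for (size_t i = 0; i < count; ++i) {
        assert(blocklengths[i] >= 0);    /* block lengths must be non-negative */
        if (displacements[i] != expected)
            return false;                /* gap, overlap, or reordering */
        expected += blocklengths[i];
    }
    return true;
}
```

Filled in as in the minimal example (blocklengths[i] = block, displacements[i] = i*block), the check returns true. That only describes the memory layout; it does not by itself explain the truncation error.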

I have verified that the error occurs with Open MPI v1.6 and v1.8.

Can anyone explain the root cause of this problem?

0 answers:

No answers