I am using parallel HDF5 and was testing one of the HDF5 Group's examples, except that I changed the dataset dimensions and switched the datatype to double. The original example is at:
https://support.hdfgroup.org/ftp/HDF5/examples/parallel/Hyperslab_by_row.c
The code I am using is:
#include "hdf5.h"
#include "stdlib.h"
#define H5FILE_NAME "SDS_row.h5"
#define DATASETNAME "IntArray"
#define NX 800 /* dataset dimensions */
#define NY 6554
#define RANK 2
int main (int argc, char **argv)
{
    /*
     * HDF5 APIs definitions
     */
    hid_t   file_id, dset_id;    /* file and dataset identifiers */
    hid_t   filespace, memspace; /* file and memory dataspace identifiers */
    hsize_t dimsf[2];            /* dataset dimensions */
    double  *data;               /* pointer to data buffer to write */
    hsize_t count[2];            /* hyperslab selection parameters */
    hsize_t offset[2];
    hid_t   plist_id;            /* property list identifier */
    herr_t  status;
    /*
     * MPI variables
     */
    int mpi_size, mpi_rank;
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Info info = MPI_INFO_NULL;

    /*
     * Initialize MPI
     */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(comm, &mpi_size);
    MPI_Comm_rank(comm, &mpi_rank);
    /*
     * Set up file access property list with parallel I/O access
     */
    plist_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(plist_id, comm, info);

    /*
     * Create a new file collectively and release property list identifier.
     */
    file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);
    H5Pclose(plist_id);
    /*
     * Create the dataspace for the dataset.
     */
    dimsf[0] = NX;
    dimsf[1] = NY;
    filespace = H5Screate_simple(RANK, dimsf, NULL);

    /*
     * Create the dataset with default properties and close filespace.
     */
    dset_id = H5Dcreate(file_id, DATASETNAME, H5T_NATIVE_DOUBLE, filespace,
                        H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Sclose(filespace);
    /*
     * Each process defines dataset in memory and writes it to the hyperslab
     * in the file.
     */
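    /* NB: integer division -- if mpi_size does not divide NX evenly,
     * the last NX % mpi_size rows are not assigned to any rank. */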
    count[0] = dimsf[0] / mpi_size;
    count[1] = dimsf[1];
    offset[0] = mpi_rank * count[0];
    offset[1] = 0;
    memspace = H5Screate_simple(RANK, count, NULL);
    /*
     * Select hyperslab in the file.
     */
    filespace = H5Dget_space(dset_id);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);
    /*
     * Initialize data buffer
     */
    data = (double *) malloc(sizeof(double) * count[0] * count[1]);
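    /* fill this rank's slab with a value that identifies the rank */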
    for (hsize_t i = 0; i < count[0] * count[1]; i++) {
        data[i] = mpi_rank + 10;
    }
    /*
     * Create property list for collective dataset write.
     */
    plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);
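    /* NB: collective transfer means every rank in the communicator must
     * make the H5Dwrite call below, or the ranks that did call it block. */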
    status = H5Dwrite(dset_id, H5T_NATIVE_DOUBLE, memspace, filespace,
                      plist_id, data);
    free(data);
    /*
     * Close/release resources.
     */
    H5Dclose(dset_id);
    H5Sclose(filespace);
    H5Sclose(memspace);
    H5Pclose(plist_id);
    H5Fclose(file_id);

    MPI_Finalize();
    return 0;
}
If I compile this against parallel HDF5 and run it with
mpirun -np 12 ./test
the program hangs. However, with NX = 500 it works, and with 4 ranks it also works. I spent the whole afternoon searching online and could not find a solution. Could someone tell me how to fix this, or what is wrong with this code? I am on macOS, compiling with OpenMPI and GCC 9.
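One thing I noticed while re-reading the code: count[0] = dimsf[0] / mpi_size is integer division, so with NX = 800 and 12 ranks each rank gets 66 rows and the last 800 % 12 = 8 rows are never selected by any rank, while with 4 ranks 800 / 4 = 200 divides evenly. I am not sure this is the cause, though, since 500 is not divisible by 12 either and that case works (and the whole dataset is only about 40 MB, so file size should not be the issue). In case it matters, a decomposition that covers all rows would look something like this (untested sketch, same variables as above):

    /* give the first NX % mpi_size ranks one extra row so the
     * per-rank hyperslabs tile the full dataset */
    hsize_t base = dimsf[0] / mpi_size;   /* rows every rank gets        */
    hsize_t rem  = dimsf[0] % mpi_size;   /* leftover rows to spread out */
    count[0]  = base + ((hsize_t) mpi_rank < rem ? 1 : 0);
    count[1]  = dimsf[1];
    offset[0] = mpi_rank * base
              + ((hsize_t) mpi_rank < rem ? (hsize_t) mpi_rank : rem);
    offset[1] = 0;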