I work with block-diagonal matrices (every block has the same size), and I get an illegal address error
when I use private with a dynamically allocated 2D array.
...
// NB is the number of blocks
// N is the block size
// A is the main matrix (block diagonal)
double** B; // a block
B = new double*[N];
for (unsigned int i = 0; i < N; i++)
    B[i] = new double[N];
#pragma acc parallel loop private(B[:N][:N]) copyin(A[:NB*N][:NB*N])
for (unsigned int b = 0; b < NB; b++) {
    #pragma acc loop
    for (unsigned int i = 0; i < N; i++) {
        #pragma acc loop
        for (unsigned int j = 0; j < N; j++) {
            B[i][j] = A[b*N+i][b*N+j];
        }
    }
    // process B
}
for (unsigned int i = 0; i < N; i++)
    delete[] B[i];
delete[] B;
The error I get is:
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
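It works fine if I flatten the array into a 1D array and use lexicographic indexing, or if I use a static 2D array. However, I call functions that take a double**
argument, so I would prefer to stick with dynamic 2D arrays...
I have read the description of the private
clause in the specification, and it does not say that dynamic 2D arrays are unsupported, so I assume I am doing something wrong...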
Answer 0 (score: 3)
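Sorry, but using an array of pointers in a private clause is not supported. The problem is that the compiler runtime would have to dynamically create a private copy for each gang, worker, or vector (depending on which loop carries the private clause) and fill in all the device pointers. That would be prohibitively high overhead.
If "B" were a fixed-size static array, "double B[N][N]", then you could use it in a private clause.
Otherwise, I would suggest privatizing the array manually by adding a third dimension.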
// NB is the number of blocks
// N is the block size
// A is the main matrix (block diagonal)
double*** B; // one block per outer iteration
B = new double**[NB];
for (unsigned int i = 0; i < NB; i++) {
    B[i] = new double*[N];
    for (unsigned int j = 0; j < N; j++) {
        B[i][j] = new double[N];
    }
}
#pragma acc parallel loop create(B[:NB][:N][:N]) copyin(A[:NB*N][:NB*N])
for (unsigned int b = 0; b < NB; b++) {
    #pragma acc loop
    for (unsigned int i = 0; i < N; i++) {
        #pragma acc loop
        for (unsigned int j = 0; j < N; j++) {
            B[b][i][j] = A[b*N+i][b*N+j];
        }
    }
    // process B
}
for (unsigned int i = 0; i < NB; i++) {
    for (unsigned int j = 0; j < N; j++) {
        delete[] B[i][j];
    }
    delete[] B[i];
}
delete[] B;
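For comparison, the fixed-size alternative mentioned earlier in this answer might look roughly like the sketch below. It rests on several assumptions that are not in the original post: N and NB are compile-time constants with made-up values, A is stored as a flat row-major array rather than as double**, and the hypothetical function block_traces computes the trace of each block as a stand-in for "process B". Without an OpenACC compiler the pragmas are ignored and the code simply runs serially.

```cpp
#include <cassert>

// Assumption: sizes are known at compile time (values are illustrative only).
constexpr unsigned int N  = 4;       // block size
constexpr unsigned int NB = 3;       // number of blocks
constexpr unsigned int M  = NB * N;  // dimension of the full matrix

// Hypothetical helper: A is an M x M block-diagonal matrix stored row-major
// in a flat array; trace[b] receives the trace of diagonal block b.
void block_traces(const double* A, double* trace) {
    assert(A != nullptr && trace != nullptr);

    double B[N][N];  // fixed-size block, so it can appear in a private clause

    #pragma acc parallel loop private(B) copyin(A[0:M*M]) copyout(trace[0:NB])
    for (unsigned int b = 0; b < NB; b++) {
        // Copy diagonal block b into the private scratch array.
        #pragma acc loop collapse(2)
        for (unsigned int i = 0; i < N; i++) {
            for (unsigned int j = 0; j < N; j++) {
                B[i][j] = A[(b * N + i) * M + (b * N + j)];
            }
        }

        // Stand-in for "process B": sum the diagonal of the block.
        double t = 0.0;
        #pragma acc loop seq
        for (unsigned int i = 0; i < N; i++) {
            t += B[i][i];
        }
        trace[b] = t;
    }
}
```

A caller would pass a flat array of M*M doubles and an output array of NB doubles. The trade-off is that this version no longer hands a double** to existing functions, which is the constraint that motivated the third-dimension approach above.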