我正在学习CUDA,但仍处于初学阶段。我正在尝试一个简单的任务,但是当我运行它时我的代码崩溃了,我不知道为什么。任何帮助将不胜感激。
编辑:在cudaMemcpy
和Image
结构中崩溃,pixelVal
属于int**
类型。这是原因吗?
原始C ++代码:
void Image::reflectImage(bool flag, Image& oldImage)
/*Reflects the Image based on users input*/
{
int rows = oldImage.N;
int cols = oldImage.M;
Image tempImage(oldImage);
for(int i = 0; i < rows; i++)
{
for(int j = 0; j < cols; j++)
tempImage.pixelVal[rows - (i + 1)][j] = oldImage.pixelVal[i][j];
}
oldImage = tempImage;
}
我的CUDA内核&amp;代码:
#define NTPB 512
__global__ void fliph(int* a, int* b, int r, int c)
{
int i = blockIdx.x * blockDim.x + threadIdx.x;
int j = blockIdx.y * blockDim.y + threadIdx.y;
if (i >= r || j >= c)
return;
a[(r - i * c) + j] = b[i * c + j];
}
void Image::reflectImage(bool flag, Image& oldImage)
/*Reflects the Image based on users input*/
{
int rows = oldImage.N;
int cols = oldImage.M;
Image tempImage(oldImage);
if(flag == true) //horizontal reflection
{
//Allocate device memory
int* dpixels;
int* oldPixels;
int n = rows * cols;
cudaMalloc((void**)&dpixels, n * sizeof(int));
cudaMalloc((void**)&oldPixels, n * sizeof(int));
cudaMemcpy(dpixels, tempImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
cudaMemcpy(oldPixels, oldImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
int nblks = (n + NTPB - 1) / NTPB;
fliph<<<nblks, NTPB>>>(dpixels, oldPixels, rows, cols);
cudaMemcpy(tempImage.pixelVal, dpixels, n * sizeof(int), cudaMemcpyDeviceToHost);
cudaFree(dpixels);
cudaFree(oldPixels);
}
oldImage = tempImage;
}
答案 0 :(得分:1)
您必须创建2D网格才能使用2D索引i
和j
处理图像。在当前情况下,内核仅处理图像的第一行。
要创建2D索引机制,请创建2D块和2D网格,如下所示:
const int BLOCK_DIM = 16;
dim3 Block(BLOCK_DIM,BLOCK_DIM);
dim3 Grid;
Grid.x = (cols + Block.x - 1)/Block.x;
Grid.y = (rows + Block.y - 1)/Block.y;
fliph<<<Grid, Block>>>(dpixels, oldPixels, rows, cols);