Passing a Mat to an OpenCL kernel causes a segmentation fault

Asked: 2017-06-01 06:53:47

Tags: opencv image-processing segmentation-fault opencl mat

I want to pass an OpenCV Mat to a self-written OpenCL kernel for an FPGA (OpenCV's own OpenCL support is not available there).

Host code:

Mat img = imread( "template.jpg", IMREAD_GRAYSCALE );
Mat output(img.rows, img.cols, CV_8UC1);
// Program, Context already declared
// Create Kernel
cl_kernel kernel = NULL;
kernel = clCreateKernel(program, "copy", &status);
// Create Command Queue and associate it with the device you want to execute on
cl_command_queue cmdQueue;
cmdQueue = clCreateCommandQueue(context,devices[0], 0,  &status);

// Buffers (I am probably doing something wrong here)
cl_mem buffer_img = clCreateBuffer(context,CL_MEM_READ_ONLY, sizeof(uint) * img.cols * img.rows,    NULL,&status);
cl_mem buffer_outputimg = clCreateBuffer(context,CL_MEM_WRITE_ONLY, sizeof(uint) * img.cols * img.rows,NULL,&status);

status = clEnqueueWriteBuffer(cmdQueue, buffer_img,CL_FALSE,0,sizeof(uint) * img.cols * img.rows,&img,0,NULL,NULL);
// set kernel arguments
status = clSetKernelArg(kernel,0,sizeof(cl_mem),&buffer_img);
status = clSetKernelArg(kernel,1,sizeof(cl_mem),&buffer_outputimg);

size_t globalWorkSize[2];
globalWorkSize[0] = img.cols;
globalWorkSize[1] = img.rows;
status = clEnqueueNDRangeKernel(cmdQueue,kernel,2,NULL, globalWorkSize, NULL,0, NULL,NULL);
clEnqueueReadBuffer(cmdQueue,buffer_outputimg,CL_TRUE,0,sizeof(uint) * img.cols * img.rows, &output,    0,  NULL,   NULL);

// Block the CPU until the queue has finished
clFinish(cmdQueue);

Kernel code:

__kernel void copy(__global  uchar *  input, __global  uchar *  output) 
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    //copy
    output[y * get_global_size(0) + x] = input[y * get_global_size(0) + x] ;
}

When I run this on the FPGA I get a segmentation fault, probably because I am handling the OpenCV Mat incorrectly.

Edit: changing the host code as suggested by api55 solved the problem:

Mat img = imread( "scene.jpg", IMREAD_GRAYSCALE );
Mat output(img.rows, img.cols, CV_8UC1);
// Program, Context already declared
// Create Kernel
cl_kernel kernel = NULL;
kernel = clCreateKernel(program, "copy", &status);
// Create Command Queue and associate it with the device you want to execute on
cl_command_queue cmdQueue;
cmdQueue = clCreateCommandQueue(context,devices[0], 0,  &status);
checkError(status, "Failed to create command queue");

// Buffer
cl_mem buffer_img = clCreateBuffer(context,CL_MEM_READ_ONLY, sizeof(uchar) * img.cols * img.rows,   NULL,&status);
cl_mem buffer_outputimg = clCreateBuffer(context,CL_MEM_WRITE_ONLY, sizeof(uchar) * img.cols * img.rows,NULL,&status);
checkError(status, "Failed to create buffer_mask");

status = clEnqueueWriteBuffer(cmdQueue, buffer_img,CL_FALSE,0,sizeof(uchar) * img.cols * img.rows,img.data,0,NULL,NULL);
checkError(status, "Failed to enqueue buffer_img");


status = clSetKernelArg(kernel,0,sizeof(cl_mem),&buffer_img);
status = clSetKernelArg(kernel,1,sizeof(cl_mem),&buffer_outputimg);

size_t globalWorkSize[2];
globalWorkSize[0] = img.cols;
globalWorkSize[1] = img.rows;
status = clEnqueueNDRangeKernel(cmdQueue,kernel,2,NULL, globalWorkSize, NULL,0, NULL,NULL);
clEnqueueReadBuffer(cmdQueue,buffer_outputimg,CL_TRUE,0,sizeof(uchar) * img.cols * img.rows, output.data,0,NULL,NULL);

imwrite("output.jpg", output);

1 answer:

Answer 0 (score: 4)

I don't have much experience with OpenCL, but I think this is an OpenCV / C++ problem.

The OpenCV Mat data lives in img.data, which is a uchar*, and its size in bytes is sizeof(T) * channels * rows * cols.

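Usually T is uchar when you load an image, and channels is 3 (unless the image is grayscale). A 3-channel uchar image is 24 bits per pixel and a grayscale one (as you load it) is 8 bits per pixel, but you are using uint, which is 32 bits. At some point the copy runs past the end of the image memory and causes a segmentation fault. Also, if you do not use the data pointer inside the structure, you are probably copying the header information, which holds only a pointer to the data, not the data itself.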
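
To make the size mismatch concrete, here is a minimal sketch in plain C++ (no OpenCV or OpenCL required). The helper names `imageBytes`, `actualGrayBytes` and `requestedBytes` are made up for illustration and are not part of the question's code; the sketch also assumes the image is stored densely, without row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes occupied by a densely packed image.
// elemSize is the size of one channel value (1 for uchar, 4 for a 32-bit uint).
std::size_t imageBytes(std::size_t rows, std::size_t cols,
                       std::size_t channels, std::size_t elemSize) {
    assert(channels > 0 && elemSize > 0);
    return rows * cols * channels * elemSize;
}

// Bytes a grayscale 8-bit image actually holds (one uchar per pixel).
std::size_t actualGrayBytes(std::size_t rows, std::size_t cols) {
    return imageBytes(rows, cols, 1, sizeof(unsigned char));
}

// Bytes the question's host code asks OpenCL to copy (32-bit uint per pixel).
std::size_t requestedBytes(std::size_t rows, std::size_t cols) {
    return imageBytes(rows, cols, 1, sizeof(std::uint32_t));
}
```

For an assumed 480 x 640 grayscale image, `actualGrayBytes(480, 640)` is 307200 while `requestedBytes(480, 640)` is 1228800, so the transfer would read 921600 bytes beyond the pixel data.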

I suggest you change &img in

status = clEnqueueWriteBuffer(cmdQueue, buffer_img,CL_FALSE,0,sizeof(uint) * img.cols * img.rows,&img,0,NULL,NULL);

to img.data.
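
The difference between the header and the data pointer can be sketched without OpenCV. `ImageHeader` below is a hypothetical stand-in, not the real cv::Mat layout, and `copyPixels` is an illustrative name: the point is only that a small header struct points at pixels stored elsewhere, so a copy has to start from the data pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an image header: a small struct that only
// points at the pixels. This is NOT the real cv::Mat layout.
struct ImageHeader {
    int rows;
    int cols;
    unsigned char* data;  // pixels live elsewhere in memory
};

// Copy rows * cols bytes by following the data pointer.
std::vector<unsigned char> copyPixels(const ImageHeader& img) {
    assert(img.data != nullptr && img.rows >= 0 && img.cols >= 0);
    const std::size_t n = static_cast<std::size_t>(img.rows) *
                          static_cast<std::size_t>(img.cols);
    std::vector<unsigned char> dst(n);
    if (n > 0) {
        std::memcpy(dst.data(), img.data, n);  // source is img.data, not &img
    }
    return dst;
}
```

Passing `&img` as the source would instead start reading at the header itself, which is only `sizeof(ImageHeader)` bytes long, and then continue into whatever memory happens to follow it.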

Finally, you need the data to be of the right type. I am not sure whether OpenCL can work with uchar, but if it cannot, convert the cv::Mat to another type:

img.convertTo(img, CV_32S);

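after loading the image. This changes it to int (OpenCV does not support matrices of unsigned int). Just make sure you change the other places accordingly (i.e. the sizeof(uint)), and if you convert the input, remember to create the output with the same type.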

If you prefer float, use CV_32F; if you want double, use CV_64F.
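
As a rough illustration of keeping the sizes consistent after a conversion, the sketch below uses plain C++ vectors in place of cv::Mat. `widenPixels` is a hypothetical stand-in for a conversion such as convertTo, and `bufferBytes` is an illustrative name; neither is an OpenCV or OpenCL call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a type conversion (e.g. 8-bit to 32-bit): widens each pixel to T.
template <typename T>
std::vector<T> widenPixels(const std::vector<unsigned char>& src) {
    std::vector<T> dst;
    dst.reserve(src.size());
    for (unsigned char v : src) {
        dst.push_back(static_cast<T>(v));
    }
    assert(dst.size() == src.size());
    return dst;
}

// Byte count to use for buffer creation and for read/write transfers.
// It is derived from the element type, so it follows any conversion.
template <typename T>
std::size_t bufferBytes(const std::vector<T>& pixels) {
    return pixels.size() * sizeof(T);
}
```

With three pixels, `bufferBytes` gives 3 for the uchar data and 12 after widening to `std::int32_t` or `float`. The kernel's pointer type would presumably have to change to match (for example `int*` or `float*` instead of `uchar*`), and the output buffer needs the same element type.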