In my program, I have set:
size_t global_item_size = 12000;
size_t local_item_size = 600;
cl_mem arr_M_obj = clCreateBuffer(context, CL_MEM_READ_WRITE, 12000 * sizeof(int), NULL, &ret);
ret = clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, &global_item_size, &local_item_size, 0, NULL, NULL);
And my kernel:
__kernel void mykernel(__global int* arrM)
{
int n = get_global_id(0);
arrM[n] = n;
}
But my results are wrong. After copying the buffer from the device to the host, I print arrM in a loop, and this is what I get:
arrM[0] = 0
arrM[1] = 0
arrM[2] = 1
arrM[3] = 0
arrM[4] = 2
arrM[5] = 0
arrM[6] = 3
...
arrM[11998] = 5999
arrM[11999] = 0
Can you help me solve this?
Answer 0 (score: 0)
Yes. If you did not omit anything from your post, the problem is simple: you forgot to set the kernel's argument, the buffer arr_M_obj.
Take a look at clSetKernelArg, the function used to set the argument.
cl_mem arr_M_obj = clCreateBuffer(context, CL_MEM_READ_WRITE, nmax * NMAX * sizeof(cl_ulong), NULL, &ret);
__kernel void PatternBranching(__global int4* arrlmer, __global int* arrMscore, __global int4* arrM, int k, int l, int n1, int sqelenght, int a1)
PatternBranching(arr_lmer_obj, arr_Mscore_obj, arr_M_obj, k, l, n1, sqelength, a1)
Notice anything wrong between the declaration of arr_Mscore_obj and the kernel signature? Change cl_ulong to cl_int.
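To see why an element-size mismatch like cl_ulong versus cl_int could produce the 0, 0, 1, 0, 2, 0, 3, 0 pattern from the question, here is a small sketch in plain C with no OpenCL involved. It assumes a little-endian host, and the helper name fill_wide_read_narrow is made up for illustration. It stores n as 8-byte values and then reads the same bytes back as 4-byte ints. This shows the symptom only; it is not a claim about which line of the original program is at fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the post): writes 0, 1, 2, ... as 64-bit
 * elements, then copies the raw bytes into an array of 32-bit ints.
 * This mimics a buffer whose writer and reader disagree on element size.
 * Assumes a little-endian machine and n_narrow <= 128. */
static void fill_wide_read_narrow(int32_t *narrow, size_t n_narrow)
{
    uint64_t wide[64];
    size_t n_wide = n_narrow / 2;  /* one 64-bit element spans two 32-bit slots */

    assert(n_narrow % 2 == 0 && n_wide <= 64);

    for (size_t i = 0; i < n_wide; ++i)
        wide[i] = (uint64_t)i;     /* like arrM[n] = n, but 8 bytes per element */

    /* Reinterpret the same bytes as 4-byte ints. */
    memcpy(narrow, wide, n_wide * sizeof(uint64_t));
}
```

With an 8-element output array, every even index 2k holds k and every odd index holds 0, which is the same shape as arrM[11998] = 5999 and arrM[11999] = 0 in the question.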