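How do I load the lines of a text file, along with their lengths and the number of lines read, cast them correctly, and pass them to the GPU?
In other words:
Input text file: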
Line1
Line2
..
LineN
app.cl
#define UP_LIMIT 256
typedef struct {
uint bar; // amount of read lines
} foomatic;
typedef struct {
uint len; // length of line
uchar s[UP_LIMIT]; // line
} foo;
__kernel void foobar(__global foo *kernel_in, __global foomatic *kernel_in2) {
// do something
}
main.cpp
#include <CL/cl.h> // provides cl_uint and cl_uchar
#define LLEN 256
typedef struct {
cl_uint bar; // amount of read lines
} foomatic;
typedef struct {
cl_uint len; // length of line
cl_uchar s[LLEN]; // line
} foo;
int main(){
// load from file
// count lines of file
foo * input = new foo[N]; // N is the amount of lines of source text file
// cast line to cl_uchar
// pass lines, their lengths and number of lines to ocl kernel
delete [] input;
}
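Answer 0 (score: 0)
Using structs in OpenCL can be tricky, because the host compiler and the device compiler may pack and align the fields differently. An array of structs is also inefficient on a GPU, because it interferes with the hardware's ability to coalesce memory accesses. A better approach is to use several flat arrays. I have something like this: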
//N is the number of strings.
cl_uint* strlens = new cl_uint[N];
cl_uchar* input = new cl_uchar[N * LLEN];
cl_int err_code = CL_SUCCESS;
//Remember to check error codes, omitted here for convenience.
cl_mem strlens_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY, N*sizeof(cl_uint), NULL, &err_code);
cl_mem input_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY, N*LLEN*sizeof(cl_uchar), NULL, &err_code);
//Some initialisation code...
//Send to device.
err_code = clEnqueueWriteBuffer(queue, strlens_buffer, CL_TRUE, 0, N*sizeof(cl_uint), strlens, 0, NULL, NULL);
err_code = clEnqueueWriteBuffer(queue, input_buffer, CL_TRUE, 0, N*LLEN*sizeof(cl_uchar), input, 0, NULL, NULL);
//Send work to the GPU...
//Clean up.
delete[] strlens;
delete[] input;
Here, context and queue are the OpenCL context associated with your device and the command queue associated with that context, respectively.