How do I pass parameters to an OpenCL kernel using pyopencl?

Date: 2013-07-13 11:45:09

Tags: opencl pyopencl

How do I pass parameters from pyopencl so that they act as preprocessor defines inside a .cl file?

That is:

foo.cl

# define LIMIT 12
typedef struct {
    uint i[LIMIT];
} foomatic;

should become

foo_nodefs.cl

typedef struct {
    uint i[LIMIT]; // python script passing LIMIT to set it
} foomatic;

Thanks,

John

1 answer:

Answer 0 (score: 3)

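Edit: expanded the answer to make it more detailed.

There are two ways to do this:

  1. (Metaprogramming) Add the preprocessor directives directly to the source string, or even run your own preprocessor using some template engine.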

    import pyopencl as cl
    import numpy
    import numpy.linalg as la
    
    a = numpy.random.rand(50000).astype(numpy.float32)
    b = numpy.random.rand(50000).astype(numpy.float32)
    
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)
    
    defines = """
        #define AXIS 0
        #define COEFF 1
        """
    
    prg = cl.Program(ctx,
        defines +
        """
        __kernel void sum(__global const float *a,
        __global const float *b, __global float *c)
        {
          int gid = get_global_id(AXIS);
          c[gid] = a[gid] + b[gid] + COEFF;
        }
        """).build()
    
    prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)
    
    a_plus_b = numpy.empty_like(a)
    cl.enqueue_copy(queue, a_plus_b, dest_buf)
    
    print(la.norm(a_plus_b - (a+b+1)), la.norm(a_plus_b))
    
  2. (The C way) Pass build options directly to clBuildProgram() using the options keyword of Program.build.

    import pyopencl as cl
    import numpy
    import numpy.linalg as la
    
    a = numpy.random.rand(50000).astype(numpy.float32)
    b = numpy.random.rand(50000).astype(numpy.float32)
    
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)
    
    prg = cl.Program(ctx, """
        __kernel void sum(__global const float *a,
        __global const float *b, __global float *c)
        {
          int gid = get_global_id(AXIS);
          c[gid] = a[gid] + b[gid] + COEFF;
        }
        """).build(options=['-D', 'AXIS=0', '-D', 'COEFF=1'])
    
    prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)
    
    a_plus_b = numpy.empty_like(a)
    cl.enqueue_copy(queue, a_plus_b, dest_buf)
    
    print(la.norm(a_plus_b - (a+b+1)), la.norm(a_plus_b))
    
(I used modified source code from the front page of the PyOpenCL documentation. Tested on pyopencl 2013.1.)
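
Applied to the struct from the question, the two approaches might look like the sketch below. The helpers `defines_header` and `define_options` are hypothetical names made up for illustration, not part of PyOpenCL. The sketch only prepares the source string and the options list; it stops short of compiling, since that needs an OpenCL context.

```python
# Hypothetical helpers (not part of PyOpenCL): turn a dict of macro
# names and values into the two forms described in the answer.

def defines_header(defines):
    """Render the dict as '#define NAME value' lines (method 1)."""
    return "".join(
        "#define {} {}\n".format(name, value)
        for name, value in defines.items()
    )


def define_options(defines):
    """Render the dict as a flat ['-D', 'NAME=value', ...] list (method 2)."""
    options = []
    for name, value in defines.items():
        options += ["-D", "{}={}".format(name, value)]
    return options


# Kernel source from the question, with LIMIT left undefined.
FOO_NODEFS = """
typedef struct {
    uint i[LIMIT];
} foomatic;
"""

params = {"LIMIT": 12}

# Method 1: prepend the generated header to the source string.
source_with_defines = defines_header(params) + FOO_NODEFS

# Method 2: leave the source untouched and build an options list.
build_options = define_options(params)

print(source_with_defines)
print(build_options)

# With a context `ctx` available, either form could then be compiled
# the same way as in the answer's examples:
#   cl.Program(ctx, source_with_defines).build()
#   cl.Program(ctx, FOO_NODEFS).build(options=build_options)
```

Method 2 keeps the .cl file byte-for-byte identical across configurations, which can be convenient when the kernel lives in a separate file; method 1 is easier to inspect, because the final source can simply be printed.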