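I have an OpenCL kernel (shown below) that executes correctly from a C host program, but when I try to run it with PyOpenCL using the code below, I get the following error: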
> Traceback (most recent call last):
> File "integral.py", line 38, in <module>
> example.execute()
> File "integral.py", line 26, in execute
> self.program.integrate_f(self.queue, self.a, None, self.a, self.dest_buf)
> File "/Library/Python/2.7/site-packages/pyopencl-2013.3-py2.7-macosx-10.9-x86_64.egg/pyopencl/__init__.py", line 506, in kernel_call
> self.set_args(*args)
> File "/Library/Python/2.7/site-packages/pyopencl-2013.3-py2.7-macosx-10.9-x86_64.egg/pyopencl/__init__.py", line 559, in kernel_set_args
> % (i+1, str(e), advice))
> pyopencl.LogicError: when processing argument #1 (1-based): Kernel.set_arg failed: invalid value - invalid kernel argument
So it looks like I am passing the kernel an invalid argument, but I can't see why it complains about this one. Any ideas?

    import pyopencl as cl
    import numpy

    class CL:
        def __init__(self):
            self.ctx = cl.create_some_context()
            self.queue = cl.CommandQueue(self.ctx)

        def loadProgram(self, filename):
            #read in the OpenCL source file as a string
            f = open(filename, 'r')
            fstr = "".join(f.readlines())
            print fstr
            #create the program
            self.program = cl.Program(self.ctx, fstr).build()

        def popCorn(self, n):
            mf = cl.mem_flags
            self.a = int(n)
            #create OpenCL buffers
            self.dest_buf = cl.Buffer(self.ctx, mf.WRITE_ONLY, numpy.empty(self.a).nbytes)

        def execute(self):
            self.program.integrate_f(self.queue, self.a, None, self.a, self.dest_buf)
            c = numpy.empty_like(self.dest_buf)
            cl.enqueue_read_buffer(self.queue, self.dest_buf, c).wait()
            print "a", self.a
            print "c", c

    if __name__ == "__main__":
        example = CL()
        example.loadProgram("integrate_f.cl")
        example.popCorn(1024)
        example.execute()

    __kernel void integrate_f(const unsigned int n, __global float* c)
    {
        unsigned int i = get_global_id(0);
        float x_i = 0 + i*((2*M_PI_F)/(float)n);
        if (x_i != 0 || x_i != 2*M_PI_F)
        {
            c[i] = exp(((-1)*(x_i*x_i))/(4*(M_PI_F*M_PI_F)));
        }
        else c[i] = 0;
    }

Answer 0 (score: 3)
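There are two errors in the kernel call. The one behind your traceback is that self.a is a Python int object, whereas the kernel expects an OpenCL unsigned int, which is specifically 32 bits wide. You need to pass a 32-bit integer explicitly, using (for example) numpy.int32(self.a). The second error is that the global work size argument needs to be a tuple.
So the correct code for the kernel call should be: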

    self.program.integrate_f(self.queue, (self.a,), None, numpy.int32(self.a), self.dest_buf)
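
For completeness, here is a minimal sketch of how the two fixes might fit into a whole script. Several things in it are my own assumptions rather than part of the question: the names `KERNEL_SRC`, `kernel_args`, `integrate_f_reference` and `run_on_device` are hypothetical, the kernel source is embedded as a string instead of being read from integrate_f.cl, the scalar is passed as `numpy.uint32` (it matches `const unsigned int n` exactly, though `numpy.int32` as above works just as well for 1024), the host array is `float32` to match the kernel's `float*`, and the result is read back with `cl.enqueue_copy` rather than `cl.enqueue_read_buffer`. The NumPy reference mirrors the kernel as written: because of the `||`, the condition is true for every `x_i`, so the `else` branch never runs.

```python
import numpy as np

# Same kernel as in the question, embedded as a string (assumption: the
# question reads it from integrate_f.cl instead).
KERNEL_SRC = """
__kernel void integrate_f(const unsigned int n, __global float* c)
{
    unsigned int i = get_global_id(0);
    float x_i = 0 + i*((2*M_PI_F)/(float)n);
    if (x_i != 0 || x_i != 2*M_PI_F)
    {
        c[i] = exp(((-1)*(x_i*x_i))/(4*(M_PI_F*M_PI_F)));
    }
    else c[i] = 0;
}
"""


def kernel_args(n):
    """Hypothetical helper: build the two arguments the original call got wrong."""
    global_size = (int(n),)  # global work size must be a tuple, not a bare int
    n_arg = np.uint32(n)     # sized 32-bit scalar for `const unsigned int n`
    return global_size, n_arg


def integrate_f_reference(n):
    """Hypothetical helper: NumPy version of the kernel, for checking results."""
    i = np.arange(n, dtype=np.float32)
    x = i * np.float32(2 * np.pi) / np.float32(n)
    return np.exp(-(x * x) / np.float32(4 * np.pi ** 2)).astype(np.float32)


def run_on_device(n=1024):
    """Run the kernel with PyOpenCL; needs a working OpenCL runtime."""
    import pyopencl as cl  # imported here so the helpers above work without it

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    program = cl.Program(ctx, KERNEL_SRC).build()

    # float32 host array, so the device buffer holds exactly n floats
    out = np.empty(n, dtype=np.float32)
    dest_buf = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, out.nbytes)

    global_size, n_arg = kernel_args(n)
    program.integrate_f(queue, global_size, None, n_arg, dest_buf)

    # Copy the device buffer back into the host array (blocking by default)
    cl.enqueue_copy(queue, out, dest_buf)
    return out
```

On a machine with an OpenCL device, `run_on_device(1024)` should return an array you can compare against `integrate_f_reference(1024)`, for instance with `np.allclose` and a small tolerance, since the device's `exp` may differ slightly from NumPy's.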