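There is not much code here, but the actual problem may lie in the C / OpenCL code generated by the Python module.
The compiler emits many repetitions of the following warning: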
<program source>:819:47: warning: comparison of integers of different signs: 'unsigned long' and 'const psc_index_type' (aka 'const int')
    if (psc_K * psc_LID_0 + psc_k < psc_offset_end)
        ~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
The code I am using is as follows:
import pyopencl as cl
import pyopencl.array
import pyopencl.algorithm
import numpy as np
platform = cl.get_platforms()
my_devices = platform[0].get_devices(device_type=cl.device_type.GPU)
ctx = cl.Context(devices=my_devices)
queue = cl.CommandQueue(ctx)
aryary = np.array([[10, 11, 12, 13, 14, 15, 16, 17], [1, 2, 3, 4, 0, 0, 0, 0], [108, 0, 0, 0, 0, 0, 0, 0]], np.int32)
cl_aryary = cl.array.to_device(queue, aryary)
lenary = np.array([8, 4, 1], np.int32)
cl_lenary = cl.array.to_device(queue, lenary)
result = cl.algorithm.copy_if(
    cl_aryary,
    "sum_array(&ary[i], len[i]) == 108",
    extra_args=[('len', cl_lenary)],
    preamble='''
    int sum_array(__global int *a, int num_elements);
    int sum_array(__global int *a, int num_elements)
    {
        int i, sum=0;
        for (i=0; i<num_elements; i++)
        {
            sum = sum + a[i];
        }
        return(sum);
    }
    ''',
    queue=queue
)
print(result)
I have tried many things here, but I cannot find what is preventing this code from compiling, running, and actually producing a result.
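To make the intended result concrete, here is a minimal host-side sketch in plain NumPy (not pyopencl). It assumes the goal is to keep every row whose first `len[i]` elements sum to 108. The helper name `rows_matching_sum` is hypothetical and only serves as a reference to compare a GPU result against.

```python
import numpy as np


def rows_matching_sum(rows, lengths, target):
    """Return the rows whose first lengths[i] elements sum to target.

    Hypothetical reference helper: this is an assumption about what the
    predicate "sum_array(&ary[i], len[i]) == 108" is meant to select.
    """
    keep = [int(row[:n].sum()) == target for row, n in zip(rows, lengths)]
    return rows[np.array(keep, dtype=bool)]


# Same input data as in the question.
aryary = np.array([[10, 11, 12, 13, 14, 15, 16, 17],
                   [1, 2, 3, 4, 0, 0, 0, 0],
                   [108, 0, 0, 0, 0, 0, 0, 0]], np.int32)
lenary = np.array([8, 4, 1], np.int32)

expected = rows_matching_sum(aryary, lenary, 108)
print(expected)
```

Under that assumption, the first row (10 + 11 + ... + 17 = 108) and the third row (108) satisfy the condition, while the second row sums to 10, so a working version would be expected to select those two rows.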