Question

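I am trying to learn string manipulation in pyopencl. I found an example program that copies a string into an empty string here: How to pass a list of strings to an opencl kernel using pyopencl? The code itself has some errors, and I'm not sure I managed to fix them. This is the modified code I'm using: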

import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
#The kernel uses one work item per char transferred
prog_str = """__kernel void foo(__global char *in, __global char *out, const int size){
                  int idx = get_global_id(0);
                  if (idx < size){
                      out[idx] = in[idx];
                  }
           }"""

#Note that the type of the array of strings is '|S40' for the length
#of third element is 40, the shape is 3 and the nbytes is 120 (3 * 40)
original_str = np.array(("this is an average string", 
                         "and another one", 
                         "let's push even more with a third string"))
str_size = len(original_str)   
copied_str = np.empty_like(original_str)                      
mf = cl.mem_flags
#length = (str_size+1) * 200
in_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=original_str)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, size=copied_str.nbytes)

#here launch the kernel with str_size number of work items, in this case 120
#this means that some of the work items won't process any meaningful char
#(not all strings have a length of 40) but it's no biggie
prog = cl.Program(ctx, prog_str).build()
event = prog.foo(queue, original_str.shape , None, in_buf, out_buf, np.int32(120))
event.wait()
cl.enqueue_copy(queue, copied_str, out_buf)
print(original_str) 
print(copied_str)

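However, I now get a Unicode decode error that I can't resolve. Searching on Google only turns up threads where the problem is escape characters.

This is the error: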

Traceback (most recent call last):
  File "clStringTest.py", line 34, in <module>
    print(copied_str)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 1504, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', "")
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 668, in array2string
    return _array2string(a, options, separator, prefix)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 460, in wrapper
    return f(self, *args, **kwargs)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 495, in _array2string
    summary_insert, options['legacy'])
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 796, in _formatArray
    curr_width=line_width)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 750, in recurser
    word = recurser(index + (-i,), next_hanging_indent, next_width)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 704, in recurser
    return format_function(a[index])
UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 8-11: code point not in range(0x110000)

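I managed to find example programs for integer/float operations, and those work. But I can't find a working example for string operations.

I would appreciate it if someone could help me.

Update 1: On my desktop I also got the Unicode error, at least initially: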

 In [1]: %run clStringTest.py                                                    
    Choose platform:
    [0] <pyopencl.Platform 'NVIDIA CUDA' at 0x5597858ab040>
    [1] <pyopencl.Platform 'Portable Computing Language' at 0x7fb273e39020>
    Choice [0]:0
    Set the environment variable PYOPENCL_CTX='0' to avoid being asked again.
    ['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
     ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
     'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
     ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
     't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
    84
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    ~/Documents/clStringTest.py in <module>
         30 print(original_str)
         31 print(len(original_str))
    ---> 32 print(copied_str)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in array_str(a, max_line_width, precision, suppress_small)
       1502         return _guarded_str(np.ndarray.__getitem__(a, ()))
       1503 
    -> 1504     return array2string(a, max_line_width, precision, suppress_small, ' ', "")
       1505 
       1506 def set_string_function(f, repr=True):

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in array2string(a, max_line_width, precision, suppress_small, separator, prefix, style, formatter, threshold, edgeitems, sign, floatmode, suffix, **kwarg)
        666         return "[]"
        667 
    --> 668     return _array2string(a, options, separator, prefix)
        669 
        670 

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in wrapper(self, *args, **kwargs)
        458             repr_running.add(key)
        459             try:
    --> 460                 return f(self, *args, **kwargs)
        461             finally:
        462                 repr_running.discard(key)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in _array2string(a, options, separator, prefix)
        493     lst = _formatArray(a, format_function, options['linewidth'],
        494                        next_line_prefix, separator, options['edgeitems'],
    --> 495                        summary_insert, options['legacy'])
        496     return lst
        497 

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in _formatArray(a, format_function, line_width, next_line_prefix, separator, edge_items, summary_insert, legacy)
        794         return recurser(index=(),
        795                         hanging_indent=next_line_prefix,
    --> 796                         curr_width=line_width)
        797     finally:
        798         # recursive closures have a cyclic reference to themselves, which

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        748 
        749             for i in range(trailing_items, 1, -1):
    --> 750                 word = recurser(index + (-i,), next_hanging_indent, next_width)
        751                 s, line = _extendLine(
        752                     s, line, word, elem_width, hanging_indent, legacy)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        702 
        703         if axes_left == 0:
    --> 704             return format_function(a[index])
        705 
        706         # when recursing, add a space to align with the [ added, and reduce the

    UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 0-3: code point not in range(0x110000)

But then I installed POCL through miniconda. Suddenly, if I run the program on the GPU, it half works. At least I no longer get the Unicode error.

$ python3 clStringTest.py 
Choose platform:
[0] <pyopencl.Platform 'NVIDIA CUDA' at 0x5561780b9f20>
[1] <pyopencl.Platform 'Portable Computing Language' at 0x7f30edb41020>
Choice [0]:0
Set the environment variable PYOPENCL_CTX='0' to avoid being asked again.
(84,)
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
84
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' ''
 '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' ''
 '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '']

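Strangely, running on the CPU still gives me the same error.

At this point I'm at a loss and have to believe this is a bug. @doqtor, what do you think?

Update 2: I tried to see what happens if I increase the number of work items and the kernel's size argument. After some trial and error I finally got the output shown by @doqtor, using (400,) work items and a size of 400. I don't know why this works.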

$ python3 clStringTest.py
Choose platform:
[0] <pyopencl.Platform 'NVIDIA CUDA' at 0x55f0f357ef20>
[1] <pyopencl.Platform 'Portable Computing Language' at 0x7fb8c82f6020>
Choice [0]:
Set the environment variable PYOPENCL_CTX='' to avoid being asked again.
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
84
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']

Now it also works on the CPU, but after the output array is printed I get this:

corrupted size vs. prev_size
Aborted (core dumped)

If I reduce the number of work items (300 and below) or the size, the dreaded Unicode error comes back on the CPU. On the GPU I lose characters, as shown above.

Answer 1

Running the code in my environment produces the following problem (I don't get the Unicode error):

Input string (original_str):

['this is an average string' 'and another one'
 "let's push even more with a third string"]

Output string (copied_str):

[ 'thiM\x1b\x7f\x00\x00 c\x0fM\x1b\x7f\x00\x00\xd0i\x0fM\x1b\x7f\x00\x00\xe0b\x0fM\x1b\x7f\x00\x00ph\x0fM\x1b\x7f'
 'pa\x0fM\x1b\x7f\x00\x00\xf0f\x0fM\x1b\x7f\x00\x00\x90\xce\x0eM\x1b\x7f\x00\x00\x80l\x0fM\x1b\x7f\x00\x00\x80k\x0fM\x1b\x7f'
 '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\xb8\x0eM\x1b\x7f\x00\x00\xc0\xbd\x0eM\x1b\x7f']

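Only the first 3 characters of the output are correct; the rest is garbage. This is because the kernel is launched with original_str.shape as the global size, and original_str has shape (3,), so global_size is 3 and only 3 work items run. Defining original_str as a one-dimensional numpy array of characters, as follows, is enough to solve the above problem: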

original_str = np.array(list("this is an average string, and another one, let's push even more with a third string"))

The global size is then (84,), and everything should work as expected:

Input string (original_str):

['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']

Output string (copied_str):

['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
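
One caveat that may explain the truncated GPU output in the question's Update 1: on Python 3, NumPy stores str elements as fixed-width UTF-32, 4 bytes per character (the 'utf-32-le' codec in the traceback points the same way), while the kernel copies one byte per work item. The sketch below is a host-side emulation only, not the OpenCL code itself; the names text, n_work_items, src and dst are mine, and the zero-filled destination is an assumption (a real uninitialized buffer can hold arbitrary bytes, which would be consistent with the decode error).

```python
import numpy as np

text = ("this is an average string, and another one, "
        "let's push even more with a third string")
chars = np.array(list(text))      # one-character str elements, as in the answer

# Element count versus byte count of the same array.
print(chars.shape, chars.dtype, chars.itemsize, chars.nbytes)

# Emulate the kernel on the host: each work item copies exactly one byte.
n_work_items = chars.shape[0]     # global size taken from the shape
src = chars.view(np.uint8)        # the raw bytes behind the characters
dst = np.zeros(chars.nbytes, dtype=np.uint8)   # assumed zero-filled output
dst[:n_work_items] = src[:n_work_items]

# Reinterpret the copied bytes as characters again.
copied = dst.view(chars.dtype)
print("".join(copied))
```

With 84 work items only 84 of the 336 bytes are copied, which corresponds to the first 21 characters.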

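As mentioned earlier in the comments, passing a multidimensional array to an OpenCL kernel will not work. The kernel can only handle one-dimensional C-style arrays correctly.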
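
One way to read this: the kernel only receives a flat pointer, so what matters on the host side is that the array's memory is one contiguous C-ordered block. The following NumPy-only sketch shows how that can be checked and how an array can be flattened to raw bytes; the array words and the variable names are hypothetical and not part of the original program.

```python
import numpy as np

# Hypothetical 2-D array of fixed-width byte strings (2 bytes per element).
words = np.array([["ab", "cd"], ["ef", "gh"]], dtype="S2")
print(words.flags["C_CONTIGUOUS"])

# Flatten to 1-D, then reinterpret the same memory as raw bytes.
flat = np.ascontiguousarray(words).reshape(-1)
raw = flat.view(np.uint8)
print(flat.shape, raw.shape, raw.tobytes())

# A column slice is a strided view, so it is not C-contiguous ...
column = words[:, 0]
print(column.flags["C_CONTIGUOUS"])

# ... and needs a contiguous copy before it can serve as a flat host buffer.
packed = np.ascontiguousarray(column)
print(packed.flags["C_CONTIGUOUS"], packed.tobytes())
```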

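As @nova found, a numpy string array also works if it has the flag C_CONTIGUOUS : True, which can be verified with print(original_str.flags). In that case it is enough to pass (original_str.nbytes,) as global_size, without any other modification to the original source code.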
