Question

I need to pass a complex data type to OpenCL as a buffer, and I would like (if possible) to avoid aligning the buffer.

In OpenCL I need two structs, and I cast the buffer to one of them to distinguish the data it carries:

typedef struct
{
   char a;
   float2 position;
} s1;

typedef struct
{
   char a;
   float2 position;
   char b;
} s2;

I define the kernel this way:

__kernel void 
Foo(
   __global const void* bufferData,
   const int amountElements // in the buffer
)
{
   // Now I cast to one of the structs depending on an extra value
   __global s1* x = (__global s1*)bufferData;

}

It only works when I align the data passed in the buffer.

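The question is: is there a way to use __attribute__((packed)) or __attribute__((aligned(1))) so that the data passed in the buffer does not have to be aligned?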

Answer 1

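If padding the smaller struct is not an option, I suggest passing another parameter that tells your kernel function what the type is, perhaps just the size of an element.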
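To illustrate the element-size idea, here is a minimal host-side C sketch, not the asker's actual code. It assumes tightly packed records whose position field starts one byte in, right after the leading char. The names pos2, POSITION_OFFSET and read_position are hypothetical and stand in for what a kernel would do with the extra size argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for OpenCL's float2 (two 32-bit floats). */
typedef struct { float x, y; } pos2;

/* Assumed layout: the position follows the single leading char `a`. */
enum { POSITION_OFFSET = 1 };

/* Read the position of record `index` from a tightly packed byte buffer.
   `elem_size` is the extra parameter: 9 for s1-like records, 10 for
   s2-like records. memcpy is used because the field is not naturally
   aligned, so a direct pointer cast would be an unaligned access. */
static pos2 read_position(const unsigned char *buffer,
                          size_t index, size_t elem_size)
{
    pos2 p;
    memcpy(&p, buffer + index * elem_size + POSITION_OFFSET, sizeof p);
    return p;
}
```

Calling read_position(buf, i, 10) walks a buffer of s2-sized records, and passing 9 walks s1-sized ones, so the same routine serves both layouts.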

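Since your data types are 9 and 10 bytes when packed, it may be worth trying to pad both out to 12 bytes, depending on how many of them you read within your kernel.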
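The sizes mentioned above can be checked on the host with a small C sketch. It assumes GCC or Clang (for __attribute__) and 4-byte floats. float2_h is a hypothetical stand-in for OpenCL's float2, and the 8-byte alignment on the "natural" variants mimics the alignment OpenCL requires for float2. The *_natural, *_packed and *_padded names are illustrative only.

```c
#include <stddef.h>

/* Hypothetical host-side stand-in for OpenCL's float2. */
typedef struct { float x, y; } float2_h;

/* Natural layout: position is forced to 8-byte alignment, as float2 is
   in OpenCL, so the compiler inserts padding after `a`. */
typedef struct {
    char a;
    float2_h position __attribute__((aligned(8)));
} s1_natural;                       /* 16 bytes */

typedef struct {
    char a;
    float2_h position __attribute__((aligned(8)));
    char b;
} s2_natural;                       /* 24 bytes */

/* Packed layout: no padding at all, position starts at offset 1. */
typedef struct __attribute__((packed)) {
    char a;
    float2_h position;
} s1_packed;                        /* 9 bytes */

typedef struct __attribute__((packed)) {
    char a;
    float2_h position;
    char b;
} s2_packed;                        /* 10 bytes */

/* Packed, then explicitly padded so both records share a 12-byte stride. */
typedef struct __attribute__((packed)) {
    char a;
    float2_h position;
    char pad[3];
} s1_padded;                        /* 12 bytes */

typedef struct __attribute__((packed)) {
    char a;
    float2_h position;
    char b;
    char pad[2];
} s2_padded;                        /* 12 bytes */
```

With a common 12-byte stride, element i of either type starts at byte 12 * i, although the position field is still not 8-byte aligned and so may need byte-wise or scalar loads on the device.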

Something else you may be interested in is the extension cl_khr_byte_addressable_store: http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/cl_khr_byte_addressable_store.html

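Update: I didn't realize you were passing a mixed array; I thought it was uniform in type. If you want to track the type on a per-element basis, you should pass a list of the types (or codes). Using plain float2 on its own in bufferData would probably be faster as well.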

__kernel void 
Foo(
   __global const float2* bufferData,
   __global const char* bufferTypes,
   const int amountElements // in the buffer
)
{
   // bufferTypes[i] holds the type code of bufferData[i]
}
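On the host side, feeding those two kernel arguments means splitting the mixed records into two parallel arrays. The following C sketch shows one way this might look. The record struct, its type field, and split_records are hypothetical names introduced for illustration; the original post does not show how the host data is stored.

```c
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's float2. */
typedef struct { float x, y; } pos2;

/* Hypothetical host-side record: a type code plus a position. */
typedef struct {
    char type;
    pos2 position;
} record;

/* Split `count` records into two parallel arrays that match the kernel
   arguments: positions -> bufferData, types -> bufferTypes.
   Both output arrays must have room for `count` elements. */
static void split_records(const record *in, size_t count,
                          pos2 *positions, char *types)
{
    for (size_t i = 0; i < count; ++i) {
        positions[i] = in[i].position;
        types[i] = in[i].type;
    }
}
```

Because each output array holds a single element type, both buffers are naturally aligned and no packing attribute is needed.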

Avoiding data alignment in OpenCL
