Question

I have the following struct:

C++:

struct ss{
    cl_float3 pos;
    cl_float value;
    cl_bool moved;
    cl_bool nextMoved;
    cl_int movePriority;
    cl_int nextMovePriority;
    cl_float value2;
    cl_float value3;
    cl_int neighbors[6];
    cl_float3 offsets[6];
    cl_float off1[6];
    cl_float off2[6];
};

OpenCL:

typedef struct{
    float3 nextPos;
    float value;
    bool moved;
    bool nextMoved;
    int movePriority;
    int nextMovePriority;
    float value2;
    float value3;
    int neighbors[6];
    float3 offsets[6];
    float off1[6];
    float off2[6];
} ss;

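I have an array of these structs and pass it to an OpenCL buffer, but when I use them in a kernel the data is corrupted.

I believe this is caused by alignment. I have read other posts about it: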

I need help understanding data alignment in OpenCL's buffers

Aligning for Memory Accesses in OpenCL/CUDA

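However, I still don't fully understand how to set the alignment of my struct correctly. I also don't fully understand the aligned and packed attribute qualifiers.

So:

Q1. Could you show me how to make my struct work correctly?

Q2. Could you explain the alignment issues and qualifiers, or give me some links that cover them?

Thanks.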

Answer 1

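I recommend declaring the members of your struct from the widest type to the narrowest. First, this avoids unused space wasted on alignment padding. Second, it usually avoids trouble with alignment differing between devices.

So,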

struct ss{
    cl_float3 pos;
    cl_float3 offsets[6];
    cl_float value;
    cl_float value2;
    cl_float value3;
    cl_float off1[6];
    cl_float off2[6];
    cl_int movePriority;
    cl_int nextMovePriority;
    cl_int neighbors[6];
    cl_bool moved;
    cl_bool nextMoved;
};
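
To make the padding argument concrete, here is a minimal host-side sketch. It does not use the OpenCL headers: `Vec4` is a hypothetical stand-in for a 16-byte-aligned vector such as cl_float4, and `std::uint32_t` stands in for cl_bool (a 32-bit unsigned integer in the Khronos headers). The struct names and the reduced member list are illustrative assumptions, not the asker's real type, and the sizes assume a typical compiler with 4-byte floats.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 16-byte-aligned vector type such as cl_float4.
struct alignas(16) Vec4 {
    float s[4];
};

// Mixed order: the second vector must start on a 16-byte boundary,
// so the compiler inserts padding in front of it.
struct Interleaved {
    Vec4 pos;             // bytes 0..15
    float value;          // bytes 16..19
    std::uint32_t moved;  // bytes 20..23, followed by 8 bytes of padding
    Vec4 offset;          // bytes 32..47
    float value2;         // bytes 48..51, followed by 12 bytes of tail padding
};

// Same members, widest type first: only tail padding remains.
struct WidestFirst {
    Vec4 pos;             // bytes 0..15
    Vec4 offset;          // bytes 16..31
    float value;          // bytes 32..35
    float value2;         // bytes 36..39
    std::uint32_t moved;  // bytes 40..43, followed by 4 bytes of tail padding
};

// Bytes actually occupied by the members, ignoring padding.
constexpr std::size_t kPayloadBytes =
    2 * sizeof(Vec4) + 2 * sizeof(float) + sizeof(std::uint32_t);

// Padding the compiler added to a struct of the given size.
constexpr std::size_t padding_bytes(std::size_t struct_size) {
    return struct_size - kPayloadBytes;
}

// Fail the build if the layout is not what the comments above claim.
static_assert(sizeof(Interleaved) == 64, "unexpected Interleaved layout");
static_assert(sizeof(WidestFirst) == 48, "unexpected WidestFirst layout");
```

Under these assumptions the reordered struct carries 4 bytes of padding instead of 20. The same kind of `static_assert` can be placed next to the real host struct so that a layout change breaks the build instead of silently corrupting buffer contents.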

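Also, watch out for the float3 type; it is usually a float4 on the GPU, and if the host-side layout does not do the same, your alignment will be off. You can switch to float4 to avoid this.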
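
The float3 pitfall can be sketched on the host without any OpenCL headers. In the snippet below, `Float4` is a hypothetical stand-in for a 16-byte device vector, `Packed3` models a host-side 3-float vector with no padding, and the two node structs are illustrative names. It only shows how the offset of the following member shifts, assuming 4-byte floats.

```cpp
#include <cstddef>

// Hypothetical stand-in for how a device typically stores float3/float4:
// 16 bytes, 16-byte aligned.
struct alignas(16) Float4 {
    float x, y, z, w;
};

// A tightly packed 3-float vector, as a naive host-side type might define it.
struct Packed3 {
    float x, y, z;
};

// Host struct built with the packed vector: 'value' lands at byte 12.
struct NaiveHostNode {
    Packed3 pos;
    float value;
};

// Struct laid out the way the device would see it: 'value' lands at byte 16.
struct DeviceLikeNode {
    Float4 pos;
    float value;
};

// True only if both structs place 'value' at the same offset and have the
// same total size, i.e. an array of one can be read as an array of the other.
constexpr bool layouts_match() {
    return offsetof(NaiveHostNode, value) == offsetof(DeviceLikeNode, value) &&
           sizeof(NaiveHostNode) == sizeof(DeviceLikeNode);
}

static_assert(offsetof(NaiveHostNode, value) == 12, "packed vec3 is 12 bytes");
static_assert(offsetof(DeviceLikeNode, value) == 16, "vec4 pushes value to 16");
```

With these stand-ins the element size is 16 bytes in one layout and 32 in the other, so every array element after the first would be read from the wrong address. Using a 4-component vector on both sides, as suggested above, removes the mismatch.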
