Question

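I am trying to implement a propagation scheme for a 32x32x32 3D texture using a GLSL compute shader, and it would be very nice if I could perform x iterations with only a single shader execution.

I have 3 textures: one is the source, one is the target, and the third accumulates everything. The source and target have to be swapped on every iteration. The pseudocode looks like this. OpenGL: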

glUseProgram(computeShaderId);
glBindImageTexture(0, srcTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);
glBindImageTexture(1, targetTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);
glBindImageTexture(2, accumulateTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);
glDispatchCompute(32,32,32);

GLSL:

#version 430
layout (local_size_x = 1, local_size_y = 1, local_size_z =1) in;
layout(rgba32f, binding = 0) uniform image3D srcTex;
layout(rgba32f, binding = 1) uniform image3D targetTex;
layout(rgba32f, binding = 2) uniform image3D accumulateTex;

void main() {
  ivec3 currentPos = ivec3(gl_GlobalInvocationID.xyz);

  for (int i = 0; i < 8; i++) {
    // Accumulate the values of the 6 neighbours (top, bottom, left, right, front, back)
    // using the current source texture.
    // This involves imageLoad.
    vec4 neighbourValues = getValuesFrom6Neighbours(currentPos, currentSource);

    imageStore(currentTarget, currentPos, neighbourValues);

    vec4 value = imageLoad(accumulateTex, currentPos);
    imageStore(accumulateTex, currentPos, neighbourValues + value);

    // The textures are swapped; I have a solution for this, so no problem here.
    swapSrcAndTarget();

    // Here is the problem: how to synchronize all the different shader invocations?
    someKindOfBarrier();
  }
}

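The problem is that I cannot do all of this in a single work group because of the size of the texture. If it fit in one work group, I could use barrier() and everything would be fine. Because the textures are swapped, all values have to be updated before the next iteration reads them again. Does anybody know whether this is possible?

Thanks, Mark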

Answer 1

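As you said, not everything fits into the set of active threads, so I don't believe this is directly possible unless you accept that there will be errors (some of the values you read may come from before the update and some from after it). In other words, all threads must finish the first ping before any of them may continue with the pong. Since only a portion of the threads is physically executing at any one time, putting the passes in a loop inside the shader will not work.

I can think of two things.

1. Break the problem into tiles that do fit, but then there is no communication across tile edges (neighbours there may be stale) until the kernel/dispatch completes.
2. Implement your own scheduling and use atomic operations to grab tasks until a complete ping is done (which means manual synchronization). Then, after a memoryBarrier(), move on to the pong. From experience, this will probably be much slower than putting glDispatchCompute in a for loop.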
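To make the ping/pong requirement concrete, here is a minimal CPU-side sketch in C of the same double-buffered scheme. It is only an illustration, not the asker's shader: the function names are hypothetical, each cell holds a single float rather than a vec4, and cells outside the grid are assumed to read as zero. The point it shows is that a pass reads only from the source buffer, and the buffers are swapped only once the pass has written every cell.

```c
#include <stddef.h>

#define N 32                 /* grid edge length, as in the question */
#define CELLS (N * N * N)

/* Flat index of cell (x, y, z) in an N*N*N grid. */
size_t cell_index(int x, int y, int z) {
    return ((size_t)z * N + (size_t)y) * N + (size_t)x;
}

/* Assumption: reads outside the grid return zero. */
float read_cell(const float *grid, int x, int y, int z) {
    if (x < 0 || y < 0 || z < 0 || x >= N || y >= N || z >= N) return 0.0f;
    return grid[cell_index(x, y, z)];
}

/* One complete pass: every dst cell is computed from src only. */
void propagate_pass(const float *src, float *dst, float *accum) {
    for (int z = 0; z < N; z++)
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++) {
                float sum = read_cell(src, x - 1, y, z) + read_cell(src, x + 1, y, z)
                          + read_cell(src, x, y - 1, z) + read_cell(src, x, y + 1, z)
                          + read_cell(src, x, y, z - 1) + read_cell(src, x, y, z + 1);
                size_t i = cell_index(x, y, z);
                dst[i] = sum;      /* write the new value to the target */
                accum[i] += sum;   /* and add it to the running total */
            }
}

/* Runs `iterations` passes. The swap happens only after a pass has finished,
 * which is the guarantee a loop inside a single dispatch cannot give across
 * work groups. Returns the buffer that holds the most recent result. */
float *propagate(float *a, float *b, float *accum, int iterations) {
    float *src = a, *dst = b;
    for (int i = 0; i < iterations; i++) {
        propagate_pass(src, dst, accum);
        float *tmp = src; src = dst; dst = tmp;
    }
    return src;
}
```

If the swap happened while some cells were still unwritten, those cells would be read in their stale state by the next pass, which is the error described above.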
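For comparison, the baseline mentioned in the second point, putting glDispatchCompute in a for loop on the host, might look roughly like the sketch below. It rests on several assumptions: a current OpenGL 4.3 context, GLEW (or a similar loader) for the entry points, and a compute shader whose main() performs a single propagation step without the loop, reading from image unit 0 and writing to image unit 1. The function runPropagation and the parameters texA and texB are hypothetical names.

```c
#include <GL/glew.h>  /* assumption: GLEW provides the GL 4.3 entry points */

/* Hypothetical helper: one dispatch per ping-pong pass.
 * The shader is assumed to do exactly one propagation step per dispatch. */
void runPropagation(GLuint computeShaderId, GLuint texA, GLuint texB,
                    GLuint accumulateTexId, int iterations)
{
    GLuint srcTexId = texA;
    GLuint targetTexId = texB;

    glUseProgram(computeShaderId);
    /* The accumulation texture stays on image unit 2 for all passes. */
    glBindImageTexture(2, accumulateTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);

    for (int i = 0; i < iterations; i++) {
        /* Rebind so that unit 0 is always the current source, unit 1 the target. */
        glBindImageTexture(0, srcTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);
        glBindImageTexture(1, targetTexId, 0, GL_TRUE, 0, GL_READ_WRITE, GL_RGBA32F);

        /* 32x32x32 work groups of size 1x1x1, as in the question. */
        glDispatchCompute(32, 32, 32);

        /* Make this pass's image stores visible to the next pass's image loads. */
        glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);

        /* Swap source and target for the next pass. */
        GLuint tmp = srcTexId;
        srcTexId = targetTexId;
        targetTexId = tmp;
    }
}
```

Here the synchronization between passes comes from the dispatch boundary plus glMemoryBarrier, so the shader itself needs no cross-work-group barrier.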
