GLSL Spinlock Only Mostly Works

Time: 2012-08-05 21:04:32

Tags: concurrency glsl


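Unfortunately, the spinlock sometimes fails to prevent errors: you can see a few small white speckles, particularly in the fourth layer. There is also one on the wing of the spaceship in the second layer. These speckles vary from frame to frame.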

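In my GLSL spinlock, when a fragment is to be drawn, the fragment program atomically reads and writes a locking value in a separate locking texture, waiting until a 0 shows up, which indicates that the lock is open. In practice, I found that the program must be written so that the threads proceed in parallel, because if two threads are on the same pixel, the warp cannot continue (one must wait while the other proceeds, yet all threads in a GPU thread warp must execute simultaneously).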


#version 420 core

//locking texture
layout(r32ui) coherent uniform uimage2D img2D_0;
//data texture, also render target
layout(rgba32f) coherent uniform image2D img2D_1;

//Inserts "new_data" into "data", a sorted list
vec4 insert(vec4 data, float new_data) {
    if      (new_data<data.x) return vec4(      new_data,data.xyz);
    else if (new_data<data.y) return vec4(data.x,new_data,data.yz);
    else if (new_data<data.z) return vec4(data.xy,new_data,data.z);
    else if (new_data<data.w) return vec4(data.xyz,new_data      );
    else                      return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    //The idea here is to keep looping over a pixel until a value is written.
    //By looping over the entire logic, threads in the same warp aren't stalled
    //by other waiting threads.  The first imageAtomicExchange call sets the
    //locking value to 1.  If the locking value was already 1, then someone
    //else has the lock, and can_write is false.   If the locking value was 0,
    //then the lock is free, and can_write is true.  The depth is then read,
    //the new value inserted, but only written if can_write is true (the
    //locking texture was free).  The second imageAtomicExchange call resets
    //the lock back to 0.

    bool have_written = false;
    while (!have_written) {
        bool can_write = (imageAtomicExchange(img2D_0,coord,1u) != 1u);

        vec4 depths = imageLoad(img2D_1,coord);
        depths = insert(depths,gl_FragCoord.z);

        if (can_write) {
            have_written = true;

            //Store the updated list, then reset the lock back to 0
            imageStore(img2D_1,coord,depths);
            imageAtomicExchange(img2D_0,coord,0u);
        }
    }

    discard; //Already wrote to render target with imageStore
}


2 Answers:

Answer 0 (score: 3)

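For reference, here is locking code that has been tested to work with Nvidia drivers 314.22 & 320.18 on a GTX670. Note that reordering the code, or rewriting it as logically equivalent code, triggers existing compiler optimization bugs (see the comments below). Also note that the code below uses bindless image references.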

// sem is initialized to zero
coherent uniform layout(size1x32) uimage2D sem;

void main(void)
{
    ivec2 coord = ivec2(gl_FragCoord.xy);

    bool done = false;
    uint locked = 0;
    while (!done)
    {
     // locked = imageAtomicCompSwap(sem, coord, 0u, 1u); will NOT work
        locked = imageAtomicExchange(sem, coord, 1u);
        if (locked == 0)
        {
            // critical section goes here

            imageAtomicExchange(sem, coord, 0u);

            // replacing this with a break will NOT work
            done = true;
        }
    }
}
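
As a rough sketch only, the same locking pattern could be applied to the sorted-depth insert from the question along the following lines. The image names `img2D_0` and `img2D_1` and the `insert` helper are taken from the question; the `memoryBarrier()` call before releasing the lock is an assumption rather than part of the tested code above, and this combined shader has not been tested on the drivers mentioned. The lock image is assumed to be cleared to 0 before drawing.

```glsl
#version 420 core

// Lock image from the question; assumed to be cleared to 0 before the draw.
layout(r32ui) coherent uniform uimage2D img2D_0;
// Four sorted depths per pixel, as in the question.
layout(rgba32f) coherent uniform image2D img2D_1;

// Inserts new_data into the sorted list data, dropping the largest entry.
vec4 insert(vec4 data, float new_data) {
    if      (new_data < data.x) return vec4(new_data, data.xyz);
    else if (new_data < data.y) return vec4(data.x, new_data, data.yz);
    else if (new_data < data.z) return vec4(data.xy, new_data, data.z);
    else if (new_data < data.w) return vec4(data.xyz, new_data);
    else                        return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    bool done = false;
    uint locked = 0u;
    while (!done) {
        // Try to take the lock; the old value is 0 only if the lock was free.
        locked = imageAtomicExchange(img2D_0, coord, 1u);
        if (locked == 0u) {
            // Critical section: read, insert, and write back while holding the lock.
            vec4 depths = imageLoad(img2D_1, coord);
            imageStore(img2D_1, coord, insert(depths, gl_FragCoord.z));

            // Assumption: order the store before the lock release.
            memoryBarrier();

            // Release the lock.
            imageAtomicExchange(img2D_0, coord, 0u);

            // Exit via the flag; the answer above reports that a break does not work.
            done = true;
        }
    }

    discard; // The result was already written with imageStore.
}
```

Unlike the question's shader, this sketch only reads the depth image after the lock has been acquired, so every load and store of `img2D_1` happens inside the critical section.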


Answer 1 (score: 2)
