Question

Suppose I have this function, which multiple threads need to run in lockstep:

std::atomic<bool> go = false;  

void func() {
        while (!go.load()) {} //sync barrier
        ...
}

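I want to get rid of the spinlock and replace it with something mutex-based. I have a lot of threads doing all kinds of work, and having a dozen of them spin is disastrous for overall throughput; it runs much faster if I put, for example, a Sleep(1) inside the spin loop.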

Is there something in the STL similar to AllMemoryBarrierWithGroupSync() in HLSL? Basically, it just makes every thread sleep at the barrier until all threads have reached it.

Answer 1

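If you are willing to use experimental features, latch or barrier would help you (both were later standardized in C++20 as std::latch and std::barrier). Otherwise, you can build a similar construct yourself with condition_variable or condition_variable_any together with shared_lock (std::shared_lock is C++14; the std::shared_mutex used below is C++17).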

Implementing a barrier with shared_mutex:

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

std::shared_mutex mtx;
std::condition_variable_any cv;
bool ready = false;

void thread_func()
{
    {
        std::shared_lock<std::shared_mutex> lock(mtx);
        cv.wait(lock, []{return ready;});
    }
    std::cout << '0';
    //Rest of calculations
}


int main()
{
    std::vector<std::thread> threads;
    for(int i = 0; i < 5; ++i)
        threads.emplace_back(thread_func);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    {
        std::unique_lock<std::shared_mutex> lock(mtx);
        std::cout << "Go\n";
        ready = true;
    }
    cv.notify_all();
    for(auto& t: threads)
        t.join();
    std::cout << "\nFinished\n";
}

Answer 2

It sounds like you want exactly what condition variables are good for.

#include <condition_variable>
#include <mutex>

bool go = false; // guarded by mtx
std::mutex mtx;
std::condition_variable cv;

void thread_func()
{
    {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, []{ return go; });
    }
    // Do stuff
}

void start_all()
{
    {
        std::unique_lock<std::mutex> lock(mtx);
        go = true;
    }
    cv.notify_all();
}
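Both answers release the waiting threads from a single flag that one thread sets. The question also asks for every thread to sleep until all of them have reached the sync point, and the same mutex plus condition_variable pattern can be extended to do that. The following is only a sketch under that assumption: `SimpleBarrier` and `run_with_simple_barrier` are hypothetical names, not standard library facilities. A generation counter makes the barrier reusable across rounds.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier built from a mutex and a condition variable.
class SimpleBarrier {
public:
    explicit SimpleBarrier(std::size_t count) : threshold_(count) {}

    void arrive_and_wait()
    {
        std::unique_lock<std::mutex> lock(mtx_);
        const std::size_t gen = generation_;
        if (++waiting_ == threshold_) {
            // Last thread to arrive: start a new generation and wake the rest.
            waiting_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            // Sleep until the generation changes; the predicate guards
            // against spurious wakeups.
            cv_.wait(lock, [&] { return gen != generation_; });
        }
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    const std::size_t threshold_;
    std::size_t waiting_ = 0;
    std::size_t generation_ = 0;
};

// Hypothetical demo helper: n threads meet at the barrier `rounds` times.
// Returns how often a thread left the barrier having seen all arrivals.
int run_with_simple_barrier(int n, int rounds)
{
    SimpleBarrier barrier(static_cast<std::size_t>(n));
    std::atomic<int> arrived{0};
    std::atomic<int> ok{0};

    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([&] {
            for (int r = 0; r < rounds; ++r) {
                ++arrived;
                barrier.arrive_and_wait();
                // Faster threads may already be in the next round, hence >=.
                if (arrived.load() >= n * (r + 1))
                    ++ok;
            }
        });

    for (auto& t : threads)
        t.join();
    return ok.load();
}
```

With 4 threads and 3 rounds, `run_with_simple_barrier(4, 3)` should return 12. Waiting threads block inside `cv_.wait` instead of spinning, which is the behaviour the question is after.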

Alternative barrier to a spinlock?
