Question

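I'm trying to incorporate threads into my project, but I have a problem where using just 1 worker thread makes it "fall asleep" permanently. Perhaps I have a race condition and just can't spot it.

My PeriodicThreads object maintains a collection of threads. Once PeriodicThreads::exec_threads() is called, the threads are notified, wake up, and perform their task. Afterwards, they go back to sleep.

The function run by such a worker thread: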

void PeriodicThreads::threadWork(size_t threadId){
    //not really used, but need to declare to use condition_variable:
    std::mutex mutex;
    std::unique_lock<std::mutex> lck(mutex);

    while (true){
        // wait until told to start working on a task:
        while (_thread_shouldWork[threadId] == false){
            _threads_startSignal.wait(lck);
        }

        thread_iteration(threadId);    //virtual function

        _thread_shouldWork[threadId] = false;   //vector of flags
        _thread_doneSignal.notify_all();

    }//end while(true) - run until terminated externally or this whole obj is deleted 
}

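As you can see, each thread monitors its own entry in a vector of flags. Once it sees that its flag is true, it executes the task and then resets its flag.

Here is the function that wakes up all the threads: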

std::atomic_bool _threadsWorking =false;

//blocks the current thread until all worker threads have completed:
void PeriodicThreads::exec_threads(){
    if(_threadsWorking ){ 
        throw std::runtime_error("you requested exec_threads(), but threads haven't yet finished executing the previous task!");
    }

    _threadsWorking = true;//NOTICE: doing this after the exception check.

    //tell all threads to unpause by setting their flags to 'true'
    std::fill(_thread_shouldWork.begin(),  _thread_shouldWork.end(),  true);
    _threads_startSignal.notify_all();

    //wait for threads to complete:

    std::mutex mutex;
    std::unique_lock<std::mutex> lck(mutex); //lock & mutex are not really used.

    auto isContinueWaiting = [&]()->bool{
        bool threadsWorking = false; 
        for (size_t i=0;  i<_thread_shouldWork.size();  ++i){
            threadsWorking |= _thread_shouldWork[i];
        }
        return threadsWorking;
    };

    while (isContinueWaiting()){
        _thread_doneSignal.wait(lck);
    }

    _threadsWorking = false;//set atomic to false 
}

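Calling exec_threads() works fine for several hundred or, in rare cases, several thousand consecutive iterations. The call is made from the main thread's while loop. The worker threads process the task, reset their flags, and go back to sleep until the next exec_threads(), and so on.

However, some time after that, the program falls into "hibernation": it appears to pause, but it doesn't crash.

During such "hibernation", a breakpoint set in any of the while loops around my condition_variables is never actually hit.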

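As a check, I created my own verification thread (running in parallel to main) that monitors my PeriodicThreads object. When the object falls into hibernation, my verification thread keeps printing to the console that no threads are currently running (the _threadsWorking atomic of PeriodicThreads is permanently set to false). However, during other tests, the atomic remains true once the "hibernation issue" begins.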

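Strangely, if I force the worker thread to sleep for at least 10 microseconds before it resets its flag, everything works fine and no "hibernation" occurs. Otherwise, if the thread is allowed to finish its task very quickly, the whole problem can appear.

I've wrapped each condition_variable wait in a while loop to guard against spurious wakeups triggering the transition, and against the case where notify_all is called before wait() is invoked on the condition_variable.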

Note that this happens even when I have only 1 worker thread.

What could be the cause?

Edit

Abandoning the vector of flags and testing with just a single flag and 1 worker thread still shows the same problem.

Answer 1

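All shared data should be protected by a mutex. The mutex should have (at least) the same scope as the shared data.

Your _thread_shouldWork container is shared data. You could create a global array of mutexes, each one protecting its own _thread_shouldWork element (see the note below). You should also have at least as many condition variables as mutexes. (You can use 1 mutex with several different condition variables, but you should not use several different mutexes with 1 condition variable.)

The condition_variable should guard the actual condition (in this case, the state of an individual element of _thread_shouldWork at any given point), and the mutex is used to protect the variables that make up that condition.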
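To make the pairing concrete, here is a minimal sketch for a single worker. It is not the asker's PeriodicThreads class: Handshake, worker, and run_rounds are hypothetical names chosen for illustration. The point it tries to show is that the flag is only read and written while holding one mutex that lives alongside the flag, and that the same mutex is the one passed to wait().

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical names for illustration; not the asker's actual class.
struct Handshake {
    std::mutex m;                      // same scope as the data it protects
    std::condition_variable start_cv;  // main -> worker
    std::condition_variable done_cv;   // worker -> main
    bool should_work = false;          // guarded by m
    bool quit = false;                 // guarded by m
};

void worker(Handshake& h, int& counter) {
    std::unique_lock<std::mutex> lk(h.m);
    for (;;) {
        // The predicate is evaluated while holding m, so the flag cannot
        // change between the check and the moment the thread blocks.
        h.start_cv.wait(lk, [&] { return h.should_work || h.quit; });
        if (h.quit) return;
        lk.unlock();
        ++counter;  // stand-in for the real task; runs without the lock
        lk.lock();
        h.should_work = false;
        h.done_cv.notify_all();
    }
}

// Runs `rounds` start/done handshakes and returns how many tasks ran.
int run_rounds(int rounds) {
    Handshake h;
    int counter = 0;
    std::thread t(worker, std::ref(h), std::ref(counter));
    for (int i = 0; i < rounds; ++i) {
        std::unique_lock<std::mutex> lk(h.m);
        h.should_work = true;  // written under the same mutex the worker waits on
        h.start_cv.notify_all();
        h.done_cv.wait(lk, [&] { return !h.should_work; });
    }
    {
        std::lock_guard<std::mutex> lk(h.m);
        h.quit = true;
    }
    h.start_cv.notify_all();
    t.join();
    return counter;
}
```

Calling run_rounds(10000) should complete every round without stalling, because neither side can miss a state change made under the shared mutex.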

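If you just use a random local mutex (as in your thread code), or no mutex at all (as in your main code), then all bets are off. It is undefined behavior, although I can see it appearing to work most of the time (by luck). What I suspect is happening is that the worker thread is missing the signal from the main thread. It could also be that your main thread is missing the signal from the worker thread. (Thread A reads the state and enters the while loop, then thread B changes the state and sends the notification, then thread A goes to sleep... waiting for a notification that has already been sent.)

A mutex with local scope is a red flag!

Note: if you use a vector, you have to watch out, because adding or removing items may trigger a resize, which will touch the elements without first grabbing the mutexes (since the vector doesn't know about your mutexes).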
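One way to sidestep the resize hazard, sketched here under assumptions, is to size the container once before any thread starts and never change it afterwards. Slot and Pool are hypothetical names, not part of the question's code. Each slot carries its own flag, mutex, and condition variable, so every flag is guarded by a mutex of matching scope.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical per-worker record: the flag travels with its own mutex and
// condition variable.
struct Slot {
    std::mutex m;
    std::condition_variable cv;
    bool should_work = false;  // guarded by m
    bool quit = false;         // guarded by m
    int runs = 0;              // guarded by m
};

class Pool {
public:
    explicit Pool(std::size_t n) : slots_(n) {  // sized once, never resized
        for (std::size_t i = 0; i < n; ++i)
            threads_.emplace_back([this, i] { work(slots_[i]); });
    }
    ~Pool() {
        for (Slot& s : slots_) {
            { std::lock_guard<std::mutex> lk(s.m); s.quit = true; }
            s.cv.notify_all();
        }
        for (std::thread& t : threads_) t.join();
    }
    // Wakes every worker, then blocks until each one has finished its round.
    void exec() {
        for (Slot& s : slots_) {
            { std::lock_guard<std::mutex> lk(s.m); s.should_work = true; }
            s.cv.notify_all();
        }
        for (Slot& s : slots_) {
            std::unique_lock<std::mutex> lk(s.m);
            s.cv.wait(lk, [&] { return !s.should_work; });
        }
    }
    int runs(std::size_t i) {
        std::lock_guard<std::mutex> lk(slots_[i].m);
        return slots_[i].runs;
    }
private:
    static void work(Slot& s) {
        std::unique_lock<std::mutex> lk(s.m);
        for (;;) {
            s.cv.wait(lk, [&] { return s.should_work || s.quit; });
            if (s.quit) return;
            ++s.runs;  // stand-in for the real task (kept under the lock for brevity)
            s.should_work = false;
            s.cv.notify_all();
        }
    }
    std::vector<Slot> slots_;  // declared first, so it exists before threads start
    std::vector<std::thread> threads_;
};
```

Because Slot holds a std::mutex it is neither copyable nor movable, so operations that would relocate elements (such as push_back) do not compile, which turns the resize hazard into a compile-time error in this sketch.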

When using an array, you also have to watch out for false sharing.
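As a rough illustration of avoiding false sharing, per-thread flags can be padded so that neighbouring elements land on different cache lines. The 64-byte line size below is an assumption (common on x86-64, not universal), and PaddedFlag and same_line are hypothetical names. C++17 also offers std::hardware_destructive_interference_size in <new>, though compiler support varies.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; check the target hardware before relying on it.
constexpr std::size_t kCacheLine = 64;

// Unpadded: adjacent array elements will usually share a cache line.
struct PackedFlag { bool value = false; };

// Padded: alignas forces each element onto its own cache-line-sized slot.
struct alignas(kCacheLine) PaddedFlag { bool value = false; };

static_assert(sizeof(PaddedFlag) == kCacheLine, "one flag per cache line");
static_assert(alignof(PaddedFlag) == kCacheLine, "aligned to a line boundary");

// True if both addresses fall within the same kCacheLine-sized block.
inline bool same_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}
```

With std::array<PaddedFlag, 4> flags, same_line(&flags[0], &flags[1]) is false, so two workers writing their own flags would not invalidate each other's cache line. The cost is extra memory per flag.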

Edit: here is a video that @Kari found helpful in explaining false sharing: https://www.youtube.com/watch?v=dznxqe1Uk3E

Worker threads hibernate permanently after executing too quickly
