Why don't some threads receive pthread_cond_broadcast?

Time: 2018-03-09 19:07:05

Tags: c pthreads signals handler broadcast

I have a threadpool of workers. Each worker executes this routine:

void* worker(void* args){
  ...
  pthread_mutex_lock(&mtx);

  while (queue == NULL && stop == 0){
    pthread_cond_wait(&cond, &mtx);
  }

  el = pop(queue);
  pthread_mutex_unlock(&mtx);

  ...
}

main thread:

int main(){

   ...
   while (stop == 0){
     ...
     pthread_mutex_lock(&mtx);  
     insert(queue, el);
     pthread_cond_signal(&cond);
     pthread_mutex_unlock(&mtx);
     ...
   }
...
}

Then I have a signal handler that executes this code when it receives a signal:

void exit_handler(){
    stop = 1;   
    pthread_mutex_lock(&mtx);
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mtx); 
}

I have omitted declarations and initialization, but the original code has them.

After a signal is received everything is fine most of the time, but sometimes some worker threads seem to stay in the wait loop, either because they don't see that the variable stop has changed or because they are not woken up by the broadcast (or both).

So the threads never end. What am I missing?

EDIT: stop=1 moved inside the critical section in exit_handler. The issue remains.

EDIT2: I was running the program on a VM with Ubuntu. Since the code appears to be totally right, I tried a different VM and OS (Xubuntu), and now it seems to work correctly. I still don't know why. Does anyone have an idea?

1 answer:

Answer 0 (score: 2)

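Some guessing here, but it's too long for a comment, so if this is wrong I will delete it. I think you may have a misconception about how pthread_cond_broadcast works (at least, it's something I've been burned by in the past). From the man page:

"The pthread_cond_broadcast() function shall unblock all threads currently blocked on the specified condition variable cond."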

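OK, that makes sense: _broadcast wakes up all threads currently blocked on cond. However, only one of the woken threads will be able to lock the mutex once they have all been woken. Also from the man page:

"The thread(s) that are unblocked shall contend for the mutex according to the scheduling policy (if applicable), and as if each had called pthread_mutex_lock()."

So this means that if 3 threads are blocked on cond and _broadcast is called, all 3 threads will wake up, but only 1 of them can grab the mutex. The other 2 will still be stuck in pthread_cond_wait, waiting for a signal. Because of this they don't see stop set to 1, and exit_handler (I'm assuming it handles a Ctrl+C software signal?) has finished signaling, so the remaining threads that lost the _broadcast race are stuck in limbo, waiting for a signal that will never come, and are therefore unable to read that the stop flag has been set.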
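This reasoning is easy to test in isolation. Below is a minimal sketch, not the asker's code: the names waiter, waiting, woken and run_broadcast_demo are made up for illustration, and the flag is set with the mutex held from an ordinary thread rather than from a signal handler. It parks three threads on a condition variable, broadcasts exactly once, and counts how many threads leave the wait loop. If only one thread could ever get the mutex back, the count would stay at 1 and the joins would hang. As far as I understand the POSIX wording, the woken threads simply take turns acquiring the mutex, so all three should return.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NUM_WAITERS 3   /* hypothetical count, chosen for illustration */

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int stop = 0;      /* predicate, only touched with mtx held */
static int waiting = 0;   /* threads that have reached the wait loop */
static int woken = 0;     /* threads that saw stop and left the loop */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    waiting++;
    while (!stop)
        pthread_cond_wait(&cond, &mtx);   /* returns with mtx held again */
    woken++;
    pthread_mutex_unlock(&mtx);           /* lets the next woken thread in */
    return NULL;
}

/* Starts the waiters, broadcasts once, and returns how many of them woke. */
int run_broadcast_demo(void)
{
    pthread_t t[NUM_WAITERS];
    for (int i = 0; i < NUM_WAITERS; i++) {
        int rc = pthread_create(&t[i], NULL, waiter, NULL);
        assert(rc == 0);
    }

    /* Wait until every thread is parked. A waiter only releases mtx inside
       pthread_cond_wait, so seeing waiting == NUM_WAITERS while holding mtx
       means all of them are blocked on cond. */
    for (;;) {
        pthread_mutex_lock(&mtx);
        int n = waiting;
        pthread_mutex_unlock(&mtx);
        if (n == NUM_WAITERS)
            break;
        sched_yield();
    }

    /* A single broadcast, issued with the mutex held. */
    pthread_mutex_lock(&mtx);
    stop = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mtx);

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(t[i], NULL);
    return woken;   /* safe to read: all waiters have been joined */
}
```

Calling run_broadcast_demo() from main and printing the result (built with -pthread) is enough to try it out.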

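I think there are two ways to work around or fix this:

  1. Use pthread_cond_timedwait. Even without being signaled, it returns from the wait once the specified timeout expires, so the thread can see stop == 1 and exit.
  2. Add a pthread_cond_signal or pthread_cond_broadcast call at the end of your worker function. This way, right before a thread exits, it signals the cond variable, allowing any other waiting thread to grab the mutex and finish processing. There is no harm in signaling a condition variable when no threads are waiting on it, so this should be fine even for the last thread.

EDIT: Here is an MCVE which shows (as far as I can tell) that my answer above is wrong, heh. As soon as I press Ctrl+C the program exits "immediately", which tells me that all of the threads quickly acquire the mutex after the broadcast, see that stop is set, and exit. main then joins the threads and the process ends.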

    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>
    #include <stdbool.h>
    #include <signal.h>
    #include <unistd.h>
    #include <time.h>   // for time(), used to seed rand()
    
    
    #define NUM_THREADS 3
    #define STACK_SIZE 10
    
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t c = PTHREAD_COND_INITIALIZER;
    volatile bool stop = false;
    int stack[STACK_SIZE] = { 0 };
    int sp = 0; // stack pointer, also doubles as the current stack size
    
    void SigHandler(int sig)
    {
      if (sig == SIGINT)
      {
        stop = true;
      }
      else
      {
        printf("Received unexpected signal %d\n", sig);
      }
    }
    
    void* worker(void* param)
    {
      long tid = (long)(param);
      while (stop == false)
      {
        // acquire the lock
        pthread_mutex_lock(&m);
        while (sp <= 0 && stop == false)  // sp should never be < 0
        {
          // there is no data in the stack to consume, wait to get signaled
          // this unlocks the mutex when it is called, and locks the
          // mutex before it returns
          pthread_cond_wait(&c, &m);
        }
    
        if (stop)
        {
          // woken up for shutdown: release the lock and leave the loop
          pthread_mutex_unlock(&m);
          break;
        }
    
        // when we get here we should be guaranteed sp >= 1
        printf("thread %ld consuming stack[%d] = %d\n", tid, sp-1, stack[sp-1]);
        sp--;
    
        pthread_mutex_unlock(&m);
    
        int sleepVal = rand() % 10;
        printf("thread %ld sleeping for %d seconds...\n", tid, sleepVal);
        sleep(sleepVal);
      }
      pthread_exit(NULL);
    }
    
    int main(void)
    {
      pthread_t threads[NUM_THREADS];
      pthread_attr_t attr;
    
      pthread_attr_init(&attr);
      pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    
      srand(time(NULL));
    
      // install the handler, otherwise Ctrl+C just kills the process
      signal(SIGINT, SigHandler);
    
      for (long i=0; i<NUM_THREADS; i++)
      {
        int rc = pthread_create(&threads[i], &attr, worker, (void*)i);
        if (rc != 0)
        {
          fprintf(stderr, "Failed to create thread %ld\n", i);
        }
      }
    
      while (stop == false)
      {
        // acquire the lock before reading sp, which the workers modify
        pthread_mutex_lock(&m);
    
        // produce data in bursts
        int numValsToInsert = rand() % (STACK_SIZE - sp);
        printf("main producing %d values\n", numValsToInsert);
    
        for (int i=0; i<numValsToInsert; i++)
        {
          // produce values for the stack
          int val = rand() % 10000;
          // I think this should already be guaranteed..?
          if (sp+1 < STACK_SIZE)
          {
            printf("main pushing stack[%d] = %d\n", sp, val);
            stack[sp++] = val;
            // signal the workers that data is ready
            //printf("main signaling threads...\n");
            //pthread_cond_signal(&c);
          }
          else
          {
            printf("stack full!\n");
          }
        }
    
        pthread_mutex_unlock(&m);
    
        // signal the workers that data is ready
        printf("main signaling threads...\n");
        pthread_cond_broadcast(&c);  
    
        int sleepVal = 1;//rand() % 5;
        printf("main sleeping for %d seconds...\n", sleepVal);
        sleep(sleepVal);    
      }
    
      // stop was set: wake any worker still blocked in pthread_cond_wait
      // so that it can see the flag and exit before we join
      pthread_mutex_lock(&m);
      pthread_cond_broadcast(&c);
      pthread_mutex_unlock(&m);
    
      for (long i=0; i<NUM_THREADS; i++)
      {
        pthread_join(threads[i], NULL);
      }
    
      return 0;
    }
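
For completeness, here is a minimal sketch of the first option from the list above (pthread_cond_timedwait). It is not taken from the asker's program: timed_worker, run_timedwait_demo, exited and the 100 ms polling interval are assumptions made for illustration. The point is only that a worker which re-checks stop on every timeout will exit even if nobody ever signals the condition variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int stop = 0;     /* shutdown flag, read and written with mtx held */
static int exited = 0;   /* set by the worker once it leaves the wait loop */

/* Hypothetical worker: re-checks stop at least every 100 ms. */
static void *timed_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    while (!stop) {
        /* pthread_cond_timedwait takes an absolute deadline, measured
           against CLOCK_REALTIME for a default-initialized condvar. */
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 100L * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec += 1;
            deadline.tv_nsec -= 1000000000L;
        }
        int rc = pthread_cond_timedwait(&cond, &mtx, &deadline);
        assert(rc == 0 || rc == ETIMEDOUT);
    }
    exited = 1;
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Sets stop without ever signaling cond, then joins the worker.
   Returns 1 if the worker noticed the flag and exited. */
int run_timedwait_demo(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, timed_worker, NULL);
    assert(rc == 0);

    /* Give the worker a moment so it is most likely already waiting. */
    struct timespec pause = { 0, 50L * 1000 * 1000 };
    nanosleep(&pause, NULL);

    pthread_mutex_lock(&mtx);
    stop = 1;                 /* deliberately no signal or broadcast */
    pthread_mutex_unlock(&mtx);

    pthread_join(t, NULL);    /* returns within roughly one timeout */
    return exited;
}
```

The trade-off is latency versus wakeups: a shorter interval reacts to stop faster but wakes idle workers more often.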