Question

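I'm developing a ray tracer in C++ using SDL and pthreads. I'm having trouble getting my program to use both cores. The threads work, but they don't drive both cores to 100%. To interface with SDL I write directly to its memory, SDL_Surface.pixels, so I assume it can't be SDL that is locking me.

My thread function looks like this: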

void* renderLines(void* pArg){
    while(true){
        //Synchronize
        pthread_mutex_lock(&frame_mutex);
        pthread_cond_wait(&frame_cond, &frame_mutex);
        pthread_mutex_unlock(&frame_mutex);

        renderLinesArgs* arg = (renderLinesArgs*)pArg;
        for(int y = arg->y1; y < arg->y2; y++){
            for(int x = 0; x < arg->width; x++){
                Color C = arg->scene->renderPixel(x, y);
                putPixel(arg->screen, x, y, C);
            }
        }

        sem_post(&frame_rendered);
    }
}

Note: scene->renderPixel is const, so I'm assuming both threads can read from the same memory. I have two worker threads, and in my main loop I make them work using:

//Signal a new frame
pthread_mutex_lock(&frame_mutex);
pthread_cond_broadcast(&frame_cond);
pthread_mutex_unlock(&frame_mutex);

//Wait for workers to be done
sem_wait(&frame_rendered);
sem_wait(&frame_rendered);

//Unlock SDL surface and flip it...

Note: I've also tried creating and joining the threads instead of synchronizing them. I compile with "-lpthread -D_POSIX_PTHREAD_SEMANTICS -pthread" and gcc does not complain.

My problem is best illustrated by a graph of CPU usage during execution:
(Source: jopsen.dk)

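As the graph shows, my program only uses one core at a time, switching between the two every once in a while, but it never drives both to 100%. What have I done wrong? I'm not using any mutexes or semaphores in the scene. What can I do to find the bug?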

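Also, if I put while(true) around scene->renderPixel() I can push both cores to 100%. So I suspected this was caused by overhead, but given a complex scene I only synchronize every 0.5 seconds (e.g. FPS: 0.5). I realize it may not be easy to tell me what my bug is, but an approach to debugging this would be great too... I haven't worked with pthreads before...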

Also, could this be a hardware or kernel issue? My kernel is:

$uname -a
Linux jopsen-laptop 2.6.27-14-generic #1 SMP Fri Mar 13 18:00:20 UTC 2009 i686 GNU/Linux


Answer 1

This is useless:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

If you are waiting for a new frame, do something like this:

int new_frame = 0;

First thread:

pthread_mutex_lock(&mutex); 
new_frame = 1; 
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);

Other threads:

pthread_mutex_lock(&mutex); 
while(new_frame == 0)
  pthread_cond_wait(&cond, &mutex); 
/* Here new_frame != 0, do things with the frame*/
pthread_mutex_unlock(&mutex);

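pthread_cond_wait() actually releases the mutex and deschedules the thread until the condition is signalled. When the condition is signalled, the thread is woken up and the mutex is re-acquired. All of this happens inside the pthread_cond_wait() function.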
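To show how the predicate loop might fit the two-worker frame loop from the question, here is a minimal sketch. Everything in it is hypothetical (FrameSync, WorkerArg, frameWorker, renderFrames are illustrative names, not from the original post), and it swaps the single new_frame flag for a frame counter, on the assumption that every worker must see every frame exactly once. The rendering itself is replaced by a simple counter increment.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical shared state; none of these names come from the original post.
struct FrameSync {
    pthread_mutex_t mutex;
    pthread_cond_t  start;   // broadcast when a new frame begins
    pthread_cond_t  done;    // signalled when the last worker finishes
    unsigned long   frame;   // frame counter: the predicate workers test
    int             pending; // workers that have not finished the current frame
    bool            quit;

    FrameSync() : frame(0), pending(0), quit(false) {
        pthread_mutex_init(&mutex, nullptr);
        pthread_cond_init(&start, nullptr);
        pthread_cond_init(&done, nullptr);
    }
    ~FrameSync() {
        pthread_cond_destroy(&done);
        pthread_cond_destroy(&start);
        pthread_mutex_destroy(&mutex);
    }
};

struct WorkerArg {
    FrameSync* sync;
    int        rendered;  // stand-in for real work: frames this worker handled
};

void* frameWorker(void* p) {
    WorkerArg* arg = static_cast<WorkerArg*>(p);
    FrameSync* s = arg->sync;
    unsigned long seen = 0;  // last frame this worker handled
    for (;;) {
        pthread_mutex_lock(&s->mutex);
        // Re-check the predicate in a loop: this guards against spurious
        // wakeups and against a broadcast sent before this thread waited.
        while (s->frame == seen && !s->quit)
            pthread_cond_wait(&s->start, &s->mutex);
        if (s->quit) {
            pthread_mutex_unlock(&s->mutex);
            return nullptr;
        }
        seen = s->frame;
        pthread_mutex_unlock(&s->mutex);  // do the work WITHOUT the mutex held

        ++arg->rendered;  // the renderPixel/putPixel loops would go here

        pthread_mutex_lock(&s->mutex);
        if (--s->pending == 0)
            pthread_cond_signal(&s->done);  // last worker wakes the main thread
        pthread_mutex_unlock(&s->mutex);
    }
}

// Runs `frames` frames on `workers` threads; returns the total number of
// (worker, frame) units of work that were carried out.
int renderFrames(int workers, int frames) {
    FrameSync s;
    std::vector<WorkerArg> args(workers);
    std::vector<pthread_t> threads(workers);
    for (int i = 0; i < workers; ++i) {
        args[i].sync = &s;
        args[i].rendered = 0;
        int rc = pthread_create(&threads[i], nullptr, frameWorker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int f = 0; f < frames; ++f) {
        pthread_mutex_lock(&s.mutex);
        s.pending = workers;
        ++s.frame;                          // change the predicate first...
        pthread_cond_broadcast(&s.start);   // ...then wake every worker
        while (s.pending != 0)
            pthread_cond_wait(&s.done, &s.mutex);
        pthread_mutex_unlock(&s.mutex);
    }
    pthread_mutex_lock(&s.mutex);
    s.quit = true;
    pthread_cond_broadcast(&s.start);
    pthread_mutex_unlock(&s.mutex);

    int total = 0;
    for (int i = 0; i < workers; ++i) {
        pthread_join(threads[i], nullptr);
        total += args[i].rendered;
    }
    return total;
}
```

Calling renderFrames(2, 5) would run five frames on two workers. The mutex is only held while the shared counters are inspected or changed, so the workers can render in parallel.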

Answer 2

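I'm going to take a wild stab in the dark and say that your worker threads are spending a lot of time waiting on the condition variable. To get good CPU performance in this kind of situation, where the code is mostly CPU bound, it helps to use a task-oriented style of programming: treat the threads as a "pool" and use a queue structure to feed work to them. They should spend very little time pulling work off the queue and most of their time doing the actual work.

What you have right now is a situation where they probably do work for a while and then notify the main thread via the semaphore that they are done. The main thread will not release them until both threads have finished working on the frame they are currently processing.

Since you're using C++, have you considered using Boost.Threads? It makes working with multithreaded code much easier, and the API is actually quite similar to pthreads, but in a "modern C++" kind of way.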
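As a rough illustration of the pool-and-queue idea, the sketch below uses the simplest possible "queue": a shared row index protected by a mutex, from which each worker takes the next unrendered row. This is a simplification of a general task queue, and all names (RowQueue, takeRow, poolWorker, renderWithQueue) are hypothetical; the rendering of a row is replaced by incrementing a per-row hit count.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical work queue: the "tasks" are simply row indices 0..height-1.
struct RowQueue {
    pthread_mutex_t mutex;
    int next;    // next row that has not been handed out yet
    int height;  // total number of rows in the frame
};

// Hand out the next unrendered row, or -1 once the frame is exhausted.
// The lock is held only for this tiny critical section.
int takeRow(RowQueue* q) {
    pthread_mutex_lock(&q->mutex);
    int y = (q->next < q->height) ? q->next++ : -1;
    pthread_mutex_unlock(&q->mutex);
    return y;
}

struct PoolArg {
    RowQueue*         queue;
    std::vector<int>* hits;  // hits[y] counts how often row y was rendered
};

void* poolWorker(void* p) {
    PoolArg* a = static_cast<PoolArg*>(p);
    // Keep pulling rows until none remain; a worker that gets cheap rows
    // simply takes more of them, so no thread idles while work is left.
    for (int y = takeRow(a->queue); y != -1; y = takeRow(a->queue))
        ++(*a->hits)[y];  // stand-in for rendering every pixel of row y
    return nullptr;
}

// Renders one frame of `height` rows on `workers` threads and returns the
// per-row hit counts.
std::vector<int> renderWithQueue(int workers, int height) {
    RowQueue q;
    pthread_mutex_init(&q.mutex, nullptr);
    q.next = 0;
    q.height = height;

    std::vector<int> hits(height, 0);
    PoolArg arg = { &q, &hits };

    std::vector<pthread_t> threads(workers);
    for (int i = 0; i < workers; ++i) {
        int rc = pthread_create(&threads[i], nullptr, poolWorker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < workers; ++i)
        pthread_join(threads[i], nullptr);

    pthread_mutex_destroy(&q.mutex);
    return hits;
}
```

Each row is written by exactly one thread, so the hits vector needs no lock of its own. Compared with a fixed y1..y2 split per thread, this kind of dynamic hand-out may balance better when some rows are much more expensive than others.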

Answer 3

I'm no pthreads guru, but it seems to me that the following code is wrong:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

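To quote this article:

pthread_cond_wait() blocks the calling thread until the specified condition is signalled. This routine should be called while the mutex is locked, and it will automatically release the mutex while it waits. After the signal is received and the thread is awakened, the mutex will be automatically locked for use by the thread. The programmer is then responsible for unlocking the mutex when the thread is finished with it.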

So it seems to me that you should release the mutex after the block of code that follows <{1}}.

Problem using pthreads to utilize multiple cores

3 answers: