Question

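I have some code that runs some intensive matrix processing, so I thought it would be faster if I multithreaded it. My intention, however, is to keep the threads alive so that they can be reused for more processing later. Here is the problem: the multithreaded version of the code runs slower than the single-threaded one, and I believe the problem lies in the way I signal the threads and keep them alive.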

I am using pthreads on Windows with C++. Here is my thread code, where runtest() is the function in which the matrix calculations happen:

void* playQueue(void* arg)
{
    while(true)
    {
        pthread_mutex_lock(&queueLock);
        if(testQueue.empty())
            break;
        else
            testQueue.pop();
        pthread_mutex_unlock(&queueLock);
        runtest();
    }
    pthread_exit(NULL); 
}

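The playQueue() function is the one passed to pthread_create. What I have right now is a queue (testQueue) of, say, 1000 items, and 100 threads. Each thread keeps running until the queue is empty (hence the code inside the mutex).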

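I believe the multithreaded version runs so slowly because of something called false sharing (I think?), and because my method of signaling the threads to call runtest() and keeping them alive is poor.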

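What would be an efficient way of doing this, so that the multithreaded version runs faster than (or at least as fast as) the iterative version?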

Here is the full version of my code (minus the matrix stuff):

# include <cstdlib>
# include <iostream>
# include <cmath>
# include <complex>
# include <string>
# include <pthread.h>
# include <queue>

using namespace std;

# include "matrix_exponential.hpp"
# include "test_matrix_exponential.hpp"
# include "c8lib.hpp"
# include "r8lib.hpp"

# define NUM_THREADS 3

int main ( );
int counter;
queue<int> testQueue;
queue<int> anotherQueue;
void *playQueue(void* arg);
void runtest();
void matrix_exponential_test01 ( );
void matrix_exponential_test02 ( );
pthread_mutex_t anotherLock;
pthread_mutex_t queueLock;
pthread_cond_t queue_cv;

int main ()

{
    counter = 0;

   /* for (int i=0;i<1; i++)
        for(int j=0; j<1000; j++)
        {
            runtest();
          cout << counter << endl;
        }*/

    pthread_t threads[NUM_THREADS];
    pthread_mutex_init(&queueLock, NULL);
    pthread_mutex_init(&anotherLock, NULL);
    pthread_cond_init (&queue_cv, NULL);
    for(int z=0; z<1000; z++)
    {
        testQueue.push(1);
    }
    for( int i=0; i < NUM_THREADS; i++ )
    {
       pthread_create(&threads[i], NULL, playQueue, (void*)NULL);
    }
    while(anotherQueue.size()<NUM_THREADS)
    {

    }
    cout << counter;
    pthread_mutex_destroy(&queueLock);
    pthread_cond_destroy(&queue_cv);
    pthread_cancel(NULL);
    cout << counter;
    return 0;
}

void* playQueue(void* arg)
{
    while(true)
    {
        cout<<counter<<endl;
        pthread_mutex_lock(&queueLock);
        if(testQueue.empty()){
                pthread_mutex_unlock(&queueLock);
            break;
        }
        else
            testQueue.pop();
        pthread_mutex_unlock(&queueLock);
        runtest();
    }
    pthread_mutex_lock(&anotherLock);
    anotherQueue.push(1);
    pthread_mutex_unlock(&anotherLock);
    pthread_exit(NULL);
}

void runtest()
{
      counter++;
      matrix_exponential_test01 ( );
      matrix_exponential_test02 ( );
}

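Here, the "matrix_exponential_tests" are taken from this website, used with permission, and are where all the matrix math happens. The counter is only used for debugging and for making sure that all the instances are running.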

Answer 1

Doesn't it get stuck?

while(true)
{
    pthread_mutex_lock(&queueLock);
    if(testQueue.empty())
        break; // <-- you break without unlocking the mutex...
    else
        testQueue.pop();
    pthread_mutex_unlock(&queueLock);
    runtest();
}

The section between the lock and the unlock runs slower than it would in a single thread.

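The mutexes are slowing you down. You should lock only the critical section, and if you want to speed things up, try not to use a mutex at all.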

You can do that by supplying the tests through the thread function's argument rather than through the queue.
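A minimal sketch of passing the work through the thread argument, assuming the items can be split into contiguous slices before the threads start. The names `Slice`, `run_slice`, `process_all`, and `process_item` are hypothetical, and `process_item` merely stands in for `runtest()`:

```cpp
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical stand-in for the real matrix work (runtest() in the question).
static void process_item(int item, long long* sum) { *sum += item; }

// Work assigned to one thread: the half-open index range [begin, end).
struct Slice {
    const std::vector<int>* items;
    std::size_t begin;
    std::size_t end;
    long long result;  // written only by the owning thread
};

static void* run_slice(void* arg)
{
    Slice* s = static_cast<Slice*>(arg);
    long long local = 0;  // accumulate locally, store once at the end
    for (std::size_t i = s->begin; i < s->end; ++i)
        process_item((*s->items)[i], &local);
    s->result = local;
    return NULL;
}

// Splits items into num_threads contiguous slices and processes them in
// parallel. No mutex is needed because no two threads share an index.
// Assumes num_threads > 0.
long long process_all(const std::vector<int>& items, int num_threads)
{
    std::vector<pthread_t> threads(num_threads);
    std::vector<Slice> slices(num_threads);
    const std::size_t n = items.size();
    for (int t = 0; t < num_threads; ++t) {
        slices[t].items = &items;
        slices[t].begin = n * t / num_threads;
        slices[t].end = n * (t + 1) / num_threads;
        slices[t].result = 0;
        pthread_create(&threads[t], NULL, run_slice, &slices[t]);
    }
    long long total = 0;
    for (int t = 0; t < num_threads; ++t) {
        pthread_join(threads[t], NULL);  // result is safe to read after join
        total += slices[t].result;
    }
    return total;
}
```

Each thread writes to shared memory only once, when it stores its result, which also limits the kind of cache-line contention the question calls false sharing. The trade-off is that a fixed split cannot rebalance if some items take much longer than others.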

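One way to avoid the mutex is to use a vector from which nothing is deleted, with a std::atomic_int (C++11) as the index (or to lock only around reading the current index and incrementing it),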
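A minimal sketch of the atomic-index idea, assuming C++11 and using `std::thread` instead of pthreads for brevity. It uses `std::atomic<std::size_t>` rather than `std::atomic_int` so the index matches the vector's size type. `process_with_atomic_index` and `process_item` are hypothetical names, and `process_item` stands in for `runtest()`:

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for runtest(); doubles the item so the effect is visible.
static void process_item(int& item) { item *= 2; }

// Every worker claims the next unprocessed index with an atomic increment.
// The vector is never resized while the workers run, so no mutex is needed.
void process_with_atomic_index(std::vector<int>& items, unsigned num_threads)
{
    std::atomic<std::size_t> next(0);  // index of the next unclaimed item

    auto worker = [&items, &next]() {
        for (;;) {
            // fetch_add returns the previous value, so each index is
            // handed to exactly one thread.
            const std::size_t i = next.fetch_add(1);
            if (i >= items.size())
                break;
            process_item(items[i]);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t)
        threads.emplace_back(worker);
    for (std::thread& th : threads)
        th.join();
}
```

Unlike a fixed split of the work, this hands out items one at a time, so a thread that finishes early simply claims the next index.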

or use an iterator like this:

vector<test> testVector;
vector<test>::iterator it;
// where `it` is initialized to:
it = testVector.begin();

Now your loop can look like this:

while(true)
{
    vector<test>::iterator it1;
    pthread_mutex_lock(&queueLock);
    it1 = (it==testVector.end())? it : it++;
    pthread_mutex_unlock(&queueLock);

    // now you are outside the critical section;
    // test the local copy it1, not the shared iterator it
    if(it1==testVector.end())
        break;
    // you don't delete from or change the vector,
    // so you can use the it1 iterator freely
    runtest();
}
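For the other half of the question, keeping the threads alive between batches, one common pattern is to let idle workers sleep on a condition variable until work arrives. The following is only a sketch under that assumption, not code from the question: `WorkerPool`, `worker_main`, `pool_submit`, `pool_shutdown`, `run_pool_demo`, and `run_job` are hypothetical names, and `run_job` is a placeholder for `runtest()`:

```cpp
#include <pthread.h>
#include <queue>
#include <vector>

// Shared state for a small pool of long-lived workers (illustrative names).
struct WorkerPool {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  // signaled when a job is queued or shutdown begins
    std::queue<int> jobs;
    bool shutting_down;
    int completed;              // guarded by lock; counts finished jobs
};

static void run_job(int /*job*/) { /* placeholder for runtest() */ }

static void* worker_main(void* arg)
{
    WorkerPool* pool = static_cast<WorkerPool*>(arg);
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        // Sleep until there is work or a shutdown request; the loop
        // re-checks the condition to guard against spurious wakeups.
        while (pool->jobs.empty() && !pool->shutting_down)
            pthread_cond_wait(&pool->work_ready, &pool->lock);
        if (pool->jobs.empty())
            break;  // shutting down and the queue is drained
        int job = pool->jobs.front();
        pool->jobs.pop();
        pthread_mutex_unlock(&pool->lock);
        run_job(job);  // the heavy work runs outside the lock
        pthread_mutex_lock(&pool->lock);
        ++pool->completed;
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

void pool_submit(WorkerPool* pool, int job)
{
    pthread_mutex_lock(&pool->lock);
    pool->jobs.push(job);
    pthread_mutex_unlock(&pool->lock);
    pthread_cond_signal(&pool->work_ready);  // wake one sleeping worker
}

void pool_shutdown(WorkerPool* pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->shutting_down = true;
    pthread_mutex_unlock(&pool->lock);
    pthread_cond_broadcast(&pool->work_ready);  // wake every worker
}

// Starts the workers, feeds them jobs, shuts down, and returns the job count.
int run_pool_demo(int num_threads, int num_jobs)
{
    WorkerPool pool;
    pthread_mutex_init(&pool.lock, NULL);
    pthread_cond_init(&pool.work_ready, NULL);
    pool.shutting_down = false;
    pool.completed = 0;

    std::vector<pthread_t> threads(num_threads);
    for (int t = 0; t < num_threads; ++t)
        pthread_create(&threads[t], NULL, worker_main, &pool);
    for (int j = 0; j < num_jobs; ++j)
        pool_submit(&pool, j);

    pool_shutdown(&pool);  // workers drain the remaining jobs, then exit
    for (int t = 0; t < num_threads; ++t)
        pthread_join(threads[t], NULL);

    int done = pool.completed;
    pthread_cond_destroy(&pool.work_ready);
    pthread_mutex_destroy(&pool.lock);
    return done;
}
```

The workers block inside pthread_cond_wait while idle, so they cost no CPU time between batches, and main joins them with pthread_join instead of spinning on a second queue.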

Efficient way of signaling and keeping a pthread open?
