How can I tell when my ThreadPool has finished its tasks?

Time: 2015-11-11 22:55:23

Tags: c++ multithreading c++11 threadpool

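In C++11, I have a ThreadPool object that manages a number of threads, with work enqueued as lambda functions. I know how many rows of data need processing, so I know ahead of time that I need to enqueue N jobs. What I'm not sure about is how to tell when all of those jobs have finished, so that I can move on to the next step.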

Here is the code that manages the ThreadPool:

#include <cstdlib>
#include <vector>
#include <deque>
#include <iostream>
#include <atomic>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <functional>   // std::function
#include <chrono>       // std::chrono::milliseconds

class ThreadPool;

class Worker {
public:
    Worker(ThreadPool &s) : pool(s) { }
    void operator()();
private:
    ThreadPool &pool;
};

class ThreadPool {
public:
    ThreadPool(size_t);
    template<class F>
    void enqueue(F f);
    ~ThreadPool();
    void joinAll();
    int taskSize();

private:
    friend class Worker;

    // the task queue
    std::deque< std::function<void()> > tasks;

    // keep track of threads
    std::vector< std::thread > workers;

    // sync
    std::mutex queue_mutex;
    std::condition_variable condition;
    bool stop;
};

void Worker::operator()()
{
    std::function<void()> task;
    while(true)
    {
        {   // acquire lock
            std::unique_lock<std::mutex> 
                lock(pool.queue_mutex);

            // look for a work item
            while ( !pool.stop && pool.tasks.empty() ) {
                // if there are none wait for notification
                pool.condition.wait(lock);
            }

            if ( pool.stop )  {// exit if the pool is stopped
                return;
            }

            // get the task from the queue
            task = pool.tasks.front();
            pool.tasks.pop_front();

        }   // release lock

        // execute the task
        task();
    }
}


// the constructor just launches some amount of workers
ThreadPool::ThreadPool(size_t threads)
    :   stop(false)
{
    for (size_t i = 0;i<threads;++i) {
        workers.push_back(std::thread(Worker(*this)));
    }

    //workers.
    //tasks.
}

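// the destructor stops and joins all threads
ThreadPool::~ThreadPool()
{
    {   // set the flag under the lock so waiting workers see it
        std::unique_lock<std::mutex> lock(queue_mutex);
        stop = true;
    }
    condition.notify_all();

    // join them (skip any that joinAll() already joined)
    for ( size_t i = 0;i<workers.size();++i) {
        if ( workers[i].joinable() ) workers[i].join();
    }
}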

void ThreadPool::joinAll() {
    // join them
    for ( size_t i = 0;i<workers.size();++i) {
        workers[i].join();
    }
}

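int ThreadPool::taskSize() {
    // read the queue under the lock; workers modify it concurrently
    std::unique_lock<std::mutex> lock(queue_mutex);
    return static_cast<int>(tasks.size());
}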

// add new work item to the pool
template<class F>
void ThreadPool::enqueue(F f)
{
    { // acquire lock
        std::unique_lock<std::mutex> lock(queue_mutex);

        // add the task
        tasks.push_back(std::function<void()>(f));
    } // release lock

    // wake up one thread
    condition.notify_one();
}

Then I distribute the work across the threads like this:

ThreadPool pool(4);
/* ... */
for (int y=0;y<N;y++) {
    pool.enqueue([this,y] {
        this->ProcessRow(y);
    });
}

// wait until all threads are finished
std::this_thread::sleep_for( std::chrono::milliseconds(100) );

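Waiting 100 milliseconds only works because I know the jobs finish in less than 100 ms, but it is obviously not the best approach. Once the N rows have been processed, the same thing has to happen again for another 1000 or so generations. Naturally, I want to start the next generation as soon as possible.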

I know there must be some way to add code to my ThreadPool so that I can do this:

while ( pool.isBusy() ) {
    std::this_thread::sleep_for( std::chrono::milliseconds(1) );
}

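I've been working on this for a few evenings now, and I'm finding it hard to locate good examples of how to do it. So, what is the proper way to implement my isBusy() method?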

3 answers:

Answer 0 (score: 2)

I figured it out!

First, I added a few extra members to the ThreadPool class:

class ThreadPool {
    /* ... existing code ... */
    /* plus the following */
    std::atomic<int> njobs_pending{0};   // must start at zero
    std::mutex main_mutex;
    std::condition_variable main_condition;
};

Now I can do better than checking some status every X milliseconds. Instead, I can block the main loop until no more jobs are pending:

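void ThreadPool::waitUntilCompleted() {
    std::unique_lock<std::mutex> lock(main_mutex);
    // the predicate guards against spurious wakeups and against the jobs
    // finishing before this call starts waiting
    main_condition.wait(lock, [this] { return njobs_pending == 0; });
}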

This works as long as I maintain the pending count with the following bookkeeping code. At the top of the ThreadPool::enqueue() function:

njobs_pending++;

And in the Worker::operator()() function, right after running the task:

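if ( --pool.njobs_pending == 0 ) {
    // take main_mutex so the notification cannot slip in between the
    // waiter's check of the count and its wait
    std::lock_guard<std::mutex> lock(pool.main_mutex);
    pool.main_condition.notify_one();
}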

The main thread can then enqueue whatever tasks are needed and sit and wait until all computation is finished:

for (int y=0;y<N;y++) {
    pool.enqueue([this,y] {
        this->ProcessRow(y);
    });
}
pool.waitUntilCompleted();
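
To see the pieces of this answer working together, here is a minimal, self-contained sketch. It is not the asker's ThreadPool: the names CountingPool and runGenerations are hypothetical, the pending count is a plain counter guarded by the queue mutex instead of a std::atomic, and the workers drain the queue before exiting on shutdown. Those are simplifying assumptions made to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: a simplified stand-in for the ThreadPool in the question.
class CountingPool {
public:
    explicit CountingPool(std::size_t threads) {
        for (std::size_t i = 0; i < threads; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~CountingPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        work_ready_.notify_all();
        for (std::thread &t : workers_) t.join();
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(job));
            ++pending_;  // stays counted until the job has finished running
        }
        work_ready_.notify_one();
    }

    // Blocks until every job enqueued so far has finished.
    void waitUntilCompleted() {
        std::unique_lock<std::mutex> lock(mutex_);
        all_done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                work_ready_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested and queue drained
                job = std::move(tasks_.front());
                tasks_.pop_front();
            }
            job();
            // Decrement and notify under the same mutex the waiter uses.
            std::lock_guard<std::mutex> lock(mutex_);
            if (--pending_ == 0) all_done_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable work_ready_;
    std::condition_variable all_done_;
    std::deque<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    std::size_t pending_ = 0;
    bool stop_ = false;
};

// Hypothetical demo: runs `generations` rounds of `rows` jobs each and
// returns how many jobs ran in total.
int runGenerations(int generations, int rows) {
    std::atomic<int> processed(0);
    CountingPool pool(4);
    for (int g = 0; g < generations; ++g) {
        for (int y = 0; y < rows; ++y)
            pool.enqueue([&processed] { ++processed; });  // stand-in for ProcessRow(y)
        pool.waitUntilCompleted();
        // Every row of this generation is done before the next one starts.
        assert(processed == (g + 1) * rows);
    }
    return processed;
}
```

Because the count is changed and checked under one mutex, the main thread cannot miss the final notification, and it wakes as soon as the last job of a generation finishes.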

Answer 1 (score: 1)

You may need to create an internal structure that associates each thread with a bool flag.

class ThreadPool {
private:
    // This Structure Will Keep Track Of Each Thread's Progress
    struct ThreadInfo {
        std::thread thread;
        bool        isDone;

        // std::thread is move-only, so take it by rvalue reference and move it
        ThreadInfo( std::thread&& threadIn ) :
            thread( std::move(threadIn) ), isDone(false)
        {}
    }; // ThreadInfo

    // This Vector Should Be Populated In The Constructor Initially And
    // Updated Anytime You Would Add A New Task.
    // This Should Also Replace // std::vector<std::thread> workers
    std::vector<ThreadInfo> workers;

public:
    // The rest of your class would appear to be the same, but you need a
    // way to test if a particular thread is currently active. When the
    // thread is done this bool flag would report as being true;

    // This will only return or report if a particular thread is done or not
    // You would have to set this variable's flag for a particular thread to
    // true when it completes its task, otherwise it will always be false
    // from moment of creation. I did not add in any bounds checking to keep
    // it simple which should be taken into consideration.
    bool isBusy( unsigned idx ) const {
        return !workers[idx].isDone;   // busy means not done yet
    }
};
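
One caveat with a plain bool flag is that it is written by a worker and read by another thread, which is a data race in C++11. The sketch below shows one possible way to apply the per-thread flag idea with std::atomic<bool> instead. The names FlaggedWorkers, anyBusy and demoFlags are hypothetical, and the sketch starts one thread per job rather than reusing pooled workers, which is a simplification of what this answer describes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: one thread per job, each with its own atomic "done" flag.
class FlaggedWorkers {
public:
    explicit FlaggedWorkers(std::vector<std::function<void()>> jobs)
        : count_(jobs.size()), done_(new std::atomic<bool>[jobs.size()]) {
        for (std::size_t i = 0; i < count_; ++i) done_[i].store(false);
        for (std::size_t i = 0; i < count_; ++i) {
            std::atomic<bool> *flag = &done_[i];
            std::function<void()> job = std::move(jobs[i]);
            threads_.emplace_back([flag, job] {
                job();
                flag->store(true);  // set only after the job has finished
            });
        }
    }

    ~FlaggedWorkers() {
        for (std::thread &t : threads_)
            if (t.joinable()) t.join();
    }

    // True while the job at index idx has not finished yet.
    bool isBusy(std::size_t idx) const {
        assert(idx < count_);  // the bounds check the answer left out
        return !done_[idx].load();
    }

    // True while at least one job is still running.
    bool anyBusy() const {
        for (std::size_t i = 0; i < count_; ++i)
            if (isBusy(i)) return true;
        return false;
    }

private:
    std::size_t count_;
    std::unique_ptr<std::atomic<bool>[]> done_;
    std::vector<std::thread> threads_;
};

// Hypothetical demo: four jobs add 1, 2, 3 and 4 to a shared sum.
int demoFlags() {
    std::atomic<int> sum(0);
    std::vector<std::function<void()>> jobs;
    for (int i = 1; i <= 4; ++i)
        jobs.push_back([&sum, i] { sum += i; });
    FlaggedWorkers workers(std::move(jobs));
    while (workers.anyBusy()) std::this_thread::yield();  // polling, as in the question
    return sum;
}
```

The flags live in a heap array because std::atomic is neither copyable nor movable, so it cannot sit inside a struct that a std::vector needs to relocate. Note that the caller still polls here; the flags only make the polling safe.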

Answer 2 (score: 0)

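If you have N jobs and the calling thread must sleep while waiting for them, the most efficient way is to create a variable that is set to N by an atomic operation before the jobs are dispatched, and to have each job atomically decrement it when it finishes its computation. You can then use an atomic instruction to test whether the variable is zero.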

Block on a wait handle, and have the decrement that brings the variable to zero signal it.

I just want to say that I don't like the idea you are asking for:

while ( pool.isBusy() ) {
    std::this_thread::sleep_for( std::chrono::milliseconds(1) );
}

It just isn't appropriate: the sleep will almost never be exactly 1 ms, it uses resources unnecessarily, and so on.

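The best approach is to decrement a variable atomically and atomically test whether everything is done; based on that atomic test, the last job simply signals the object that WaitForSingleObject is waiting on. The wait, if it is needed at all, happens in WaitForSingleObject, and the thread wakes up once on completion rather than many times.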
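
WaitForSingleObject is a Windows API call. As a portable illustration of the same idea, here is a minimal C++11 sketch of a countdown latch. The names CountdownLatch and sumRows are hypothetical, and the counter is guarded by a mutex and paired with a std::condition_variable rather than being a std::atomic signalling a Windows event, so this is a swapped-in technique and not the exact mechanism described above.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class: C++11 has no standard latch, so this is a small one.
class CountdownLatch {
public:
    explicit CountdownLatch(int count) : remaining_(count) {
        assert(count >= 0);
    }

    // Called once by each job when it finishes.
    void countDown() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--remaining_ == 0) zero_.notify_all();  // only the last job signals
    }

    // Blocks the caller until the count reaches zero.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        zero_.wait(lock, [this] { return remaining_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable zero_;
    int remaining_;
};

// Hypothetical demo: n jobs each fill one slot, then the caller sums the slots.
int sumRows(int n) {
    std::vector<int> results(n, 0);
    CountdownLatch latch(n);
    std::vector<std::thread> threads;
    for (int y = 0; y < n; ++y)
        threads.emplace_back([&results, &latch, y] {
            results[y] = y * y;  // stand-in for ProcessRow(y)
            latch.countDown();
        });
    latch.wait();  // returns once all n jobs have counted down
    int total = 0;
    for (int v : results) total += v;
    for (std::thread &t : threads) t.join();
    return total;
}
```

The waiting thread sleeps inside wait() and is notified only when the last job counts down, instead of waking every millisecond to poll.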
