Question

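I have a lot of jobs and I want to run a subset of them in parallel. For example, I have 100 jobs to run and I want to run 10 threads at a time. This is my current code for this problem: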

#include <chrono>
#include <thread>
#include <vector>
#include <iostream>
#include <atomic>
#include <random>
#include <mutex>

int main() {
    constexpr std::size_t NUMBER_OF_THREADS(10);
    std::atomic<std::size_t> numberOfRunningJobs(0);

    std::vector<std::thread> threads;
    std::mutex maxThreadsMutex;
    std::mutex writeMutex;
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, 2);

    for (std::size_t id(0); id < 100; ++id) {
        if (numberOfRunningJobs >= NUMBER_OF_THREADS - 1) {
            maxThreadsMutex.lock();
        }
        ++numberOfRunningJobs;
        threads.emplace_back([id, &numberOfRunningJobs, &maxThreadsMutex, &writeMutex, &distribution, &generator]() {
            auto waitSeconds(distribution(generator));
            std::this_thread::sleep_for(std::chrono::seconds(waitSeconds));
            writeMutex.lock();
            std::cout << id << " " << waitSeconds << std::endl;
            writeMutex.unlock();
            --numberOfRunningJobs;
            maxThreadsMutex.unlock();
        });
    }

    for (auto &thread : threads) {
        thread.join();
    }

    return 0;
}

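In the for loop I check how many jobs are running and, if a slot is free, add a new thread to the vector. At the end of each thread I decrement the number of running jobs and unlock the mutex so that a new thread can start. This solves my task, but there is one thing I don't like: I need a vector of size 100 to store all the threads, and I have to join all 100 threads at the end. I would like to remove each thread from the vector once it has finished, so that the vector holds at most 10 threads and I only have to join 10 threads at the end. I thought about passing the vector and an iterator to the lambda by reference so that I could remove the element at the end, but I don't know how. How can I optimize my code so that the vector holds at most 10 elements?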

Answer 1

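Since you don't seem to need extremely fine-grained thread control, I'd recommend approaching this problem with OpenMP. OpenMP is an industry-standard, directive-based approach for parallelizing C, C++, and Fortran code. Every major compiler for these languages has an implementation.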

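Using it can substantially reduce the complexity of the code. To use OpenMP, compile with OpenMP support enabled (for example, the -fopenmp flag with GCC). The code then becomes: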

#include <chrono>
#include <iostream>
#include <random>
#include <thread>

int main() {
    constexpr std::size_t NUMBER_OF_THREADS(10);

    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, 2);

    //Distribute the loop between threads ensuring that only
    //a specific number of threads are ever active at once.
    #pragma omp parallel for num_threads(NUMBER_OF_THREADS)
    for (std::size_t id(0); id < 100; ++id) {
        //Declared before the critical section so that it stays in scope below
        int waitSeconds;

        #pragma omp critical //Serialize access to generator
        waitSeconds = distribution(generator);

        std::this_thread::sleep_for(std::chrono::seconds(waitSeconds));

        #pragma omp critical //Serialize access to cout
        std::cout << id << " " << waitSeconds << std::endl;
    }        

    return 0;
}

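There are times when you need to spawn and coordinate threads directly, but the large number of new languages and libraries designed to make parallelism easier suggests that a simpler path to parallelism is sufficient for many use cases.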
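One caveat about the loop above: it leaves the schedule to the implementation, which commonly splits the iterations into fixed blocks per thread up front. When jobs take different amounts of time, the schedule(dynamic) clause is closer to "start the next job as soon as a slot is free". A minimal sketch of that variant follows; runJobsDynamic is a hypothetical name, and squaring the index is only a stand-in for a real job:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the answer above: runs jobCount dummy jobs
// and returns one result per job.
std::vector<int> runJobsDynamic(int jobCount) {
    if (jobCount < 0) {
        jobCount = 0;
    }
    std::vector<int> results(static_cast<std::size_t>(jobCount), 0);

    // schedule(dynamic) hands out iterations one at a time to whichever
    // thread becomes free, instead of pre-assigning fixed blocks.
    #pragma omp parallel for num_threads(10) schedule(dynamic)
    for (int id = 0; id < jobCount; ++id) {
        // Each iteration writes only its own slot, so no locking is needed.
        results[static_cast<std::size_t>(id)] = id * id;
    }

    return results;
}
```

Without OpenMP enabled the pragma is ignored and the loop simply runs serially, so the result is the same either way.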

Answer 2

The keyword "thread pool" helped me a lot. I tried boost::asio::thread_pool and it works the same way as my first approach. I solved my problem with:

#include <chrono>
#include <thread>
#include <iostream>
#include <atomic>
#include <random>
#include <mutex>
#include <boost/asio/thread_pool.hpp>
#include <boost/asio/post.hpp>

int main() {
    boost::asio::thread_pool threadPool(10);
    std::mutex writeMutex;
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, 2);
    std::atomic<std::size_t> currentlyRunning(0);

    for (std::size_t id(0); id < 100; ++id) {
        boost::asio::post(threadPool, [id, &writeMutex, &distribution, &generator, &currentlyRunning]() {
            ++currentlyRunning;
            int waitSeconds;
            {
                // The lock protects both std::cout and the shared generator;
                // neither is safe to use from several threads at once.
                std::lock_guard<std::mutex> lock(writeMutex);
                waitSeconds = distribution(generator);
                std::cout << "Start: " << id << " " << currentlyRunning << std::endl;
            }
            std::this_thread::sleep_for(std::chrono::seconds(waitSeconds));
            {
                std::lock_guard<std::mutex> lock(writeMutex);
                std::cout << "Stop: " << id << " " << waitSeconds << std::endl;
            }
            --currentlyRunning;
        });
    }

    threadPool.join();
    return 0;
}
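If Boost is not an option, the same idea can be sketched with the standard library alone: start a fixed number of worker threads once and let each of them pull job indices from a shared atomic counter. The vector then never holds more than the chosen number of threads, and only those have to be joined. This is a minimal sketch under that assumption, not a full thread pool; runJobs and its parameters are hypothetical names, and the job callable must itself be safe to call from several threads:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs job(0) .. job(jobCount - 1) on at most
// workerCount threads and returns once every job has finished.
void runJobs(std::size_t jobCount, std::size_t workerCount,
             const std::function<void(std::size_t)>& job) {
    assert(workerCount > 0);
    std::atomic<std::size_t> nextJob(0);  // index of the next unclaimed job

    const std::size_t threadCount = std::min(jobCount, workerCount);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    for (std::size_t w = 0; w < threadCount; ++w) {
        workers.emplace_back([&nextJob, &job, jobCount]() {
            // fetch_add hands every index to exactly one worker.
            for (std::size_t id = nextJob.fetch_add(1); id < jobCount;
                 id = nextJob.fetch_add(1)) {
                job(id);
            }
        });
    }

    // Only threadCount joins are needed, however many jobs there were.
    for (auto& worker : workers) {
        worker.join();
    }
}
```

For the example in the question, the call would be runJobs(100, 10, ...) with a lambda that sleeps and prints under a mutex, as in the code above.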

Remove finished threads from a vector
