Question

I have a question.

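I was playing with threads today. I wanted to make sure that on a 2-core processor, when I split a task into two parts, I get close to a 2x speedup (no locks are involved here).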

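I wanted to use a simple mechanism without anything fancy like async: just a plain thread running a function that performs some very simple calculations. I wrote a function that sums the integers in a given range and stores the result in an output parameter passed by reference.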

I added time measurement and ran the example with 1 thread.
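The measurement described above could look roughly like the sketch below. The names `sum_range` and `time_us` are hypothetical and do not appear in the posted program. The sketch also assumes a 64-bit unsigned loop counter and uses `steady_clock` rather than `high_resolution_clock`, so it is an illustration and not the exact code that produced the timings.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical function: adds every integer in [b, e) to `sum`.
// The counter has the same unsigned 64-bit type as the bounds.
void sum_range(std::uint64_t b, std::uint64_t e, std::uint64_t& sum)
{
    assert(b <= e);  // precondition assumed for this sketch
    for (std::uint64_t i = b; i < e; ++i)
    {
        sum += i;
    }
}

// Hypothetical helper: runs `f` once and returns the elapsed
// wall-clock time in microseconds.
template <typename F>
std::int64_t time_us(F&& f)
{
    // steady_clock is monotonic, so stop - start is never negative.
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A call such as `time_us([&] { sum_range(0, 1000, s); })` would time one run. Whether the loop survives optimization depends on the compiler flags, so timings from a sketch like this need to be read with care.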

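Then I changed the program to run the same function in two separate threads, with the range split between them. Of course, to return the result to the main thread by reference, I had to use std::ref (otherwise the thread cannot be created). I was very disappointed when I noticed that the execution time of the program had barely changed.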
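A minimal sketch of the two-thread variant, assuming hypothetical names (`sum_range`, `sum_two_threads`) and a 64-bit unsigned loop counter, neither of which comes from the posted program. It shows why `std::ref` is needed: `std::thread` copies its arguments, so a plain variable cannot bind to the `uint64_t&` parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical worker: adds every integer in [b, e) to `sum`.
void sum_range(std::uint64_t b, std::uint64_t e, std::uint64_t& sum)
{
    assert(b <= e);  // precondition assumed for this sketch
    for (std::uint64_t i = b; i < e; ++i)
    {
        sum += i;
    }
}

// Hypothetical driver: splits [0, n) in half and sums each half
// in its own thread, with no shared state between the two.
std::uint64_t sum_two_threads(std::uint64_t n)
{
    const std::uint64_t mid = n / 2;
    std::uint64_t low = 0;
    std::uint64_t high = 0;

    // std::thread stores copies of its arguments. std::ref wraps each
    // variable in a reference_wrapper, which the thread unwraps into
    // a uint64_t& when it invokes sum_range.
    std::thread t1(sum_range, std::uint64_t{0}, mid, std::ref(low));
    std::thread t2(sum_range, mid, n, std::ref(high));
    t1.join();
    t2.join();

    // Both threads have finished, so reading the results is safe.
    return low + high;
}
```

Each thread writes only to its own accumulator, so no lock is required. The two accumulators are adjacent local variables, though, so they may share a cache line.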

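Then I created a new function that accepts a reference_wrapper instead and ran it in the 1-thread program. Its execution time is now 2x that of the 2-thread program. So I have more or less confirmed that the 2-thread program gets a 2x speedup over the 1-thread program when both use std::ref. The bad part is that I can run the single-threaded program without bothering with std::ref at all, and that version runs as fast as the 2-thread program with std::ref.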
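For reference, a small sketch of the two parameter styles side by side. The function names are hypothetical, and the loop counter is assumed to be a 64-bit unsigned integer. In the second function every update goes through `get()`, which returns the wrapped `uint64_t&`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical: accumulates through a plain reference.
void sum_by_reference(std::uint64_t b, std::uint64_t e, std::uint64_t& sum)
{
    assert(b <= e);  // precondition assumed for this sketch
    for (std::uint64_t i = b; i < e; ++i)
    {
        sum += i;
    }
}

// Hypothetical: accumulates through a std::reference_wrapper.
// The wrapper stores a pointer to the original object.
void sum_by_wrapper(std::uint64_t b, std::uint64_t e,
                    std::reference_wrapper<std::uint64_t> sum)
{
    assert(b <= e);  // precondition assumed for this sketch
    for (std::uint64_t i = b; i < e; ++i)
    {
        sum.get() += i;  // get() yields the wrapped uint64_t&
    }
}
```

Both functions modify the caller's variable, for example `sum_by_reference(0, 10, a)` and `sum_by_wrapper(0, 10, std::ref(a))`. Whether the two loops compile to the same machine code depends on the compiler and the optimization level.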

Below I paste the program containing all 3 variants, followed by the output on my 2-core machine.

#include <chrono>
#include <cstdint>     // uint64_t
#include <functional>  // std::ref, std::reference_wrapper
#include <iostream>
#include <thread>
#include <vector>

using namespace std;
using namespace std::chrono;

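// Adds the integers in [b, e) to sum, which is passed by plain reference.
void work(uint64_t b, uint64_t e, uint64_t& sum)
{
    // Note: the counter is an int while the bounds are uint64_t, so the
    // bounds passed from main() do not fit in it.
    for (int i = b; i < e; ++i)
    {
        sum += i;
    }
}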

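// Same loop as work(), but sum arrives wrapped in a std::reference_wrapper.
void work1(uint64_t b, uint64_t e, std::reference_wrapper<uint64_t> sum)
{
    // Note: same int counter as in work().
    for (int i = b; i < e; ++i)
    {
        sum += i;
    }
}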

int main() {
    uint64_t sum = 0u;
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    work(0, 1000000000000000000, sum);
    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto duration = duration_cast<microseconds>( t2 - t1 ).count();
    cout << "one thread, duration = " << duration << endl;

    sum = 0u;
    t1 = high_resolution_clock::now();
    work1(0, 1000000000000000000, std::ref(sum));
    t2 = high_resolution_clock::now();
    duration = duration_cast<microseconds>( t2 - t1 ).count();
    cout << "one thread, std::ref, duration = " << duration << endl;

    uint64_t sum21 = 0u;
    uint64_t sum22 = 0u;
    t1 = high_resolution_clock::now();
    thread th1(work, 0, 500000000000000000, std::ref(sum21));
    thread th2(work, 500000000000000000, 1000000000000000000, std::ref(sum22));
    th1.join();
    th2.join();
    sum = sum21 + sum22;
    t2 = high_resolution_clock::now();
    duration = duration_cast<microseconds>( t2 - t1 ).count();
    cout << "two threads, std::ref, duration = " << duration << endl;

    return 0;
}

Results:

one thread, duration = 7777627
one thread, std::ref, duration = 17021098
two threads, std::ref, duration = 7426095

My question is: why does std::ref slow things down so much?

Regards,

YK

Does passing a parameter to a thread via std::ref slow down execution?

0 answers: