Why is a function that returns by value slower than one that uses pass-by-reference?

Time: 2019-05-16 16:15:13

Tags: c++ optimization pass-by-reference benchmarking rvo

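I understand that the C++ Core Guidelines recommend returning a std::vector by value (so that RVO/NRVO/move semantics can apply) rather than filling it through a reference parameter. However, when I test this with the benchmark code below, the pass-by-reference function appears to be much faster than the function that returns by value. Why is my PassByReferenceMultiply function so much faster than my RVOMultiply function?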
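To make the RVO/NRVO/move-semantics point concrete, here is a minimal sketch that counts copy and move constructions when an object is returned by value. `CopyCounter`, `MakeByValue`, `ReturnStats`, and `MeasureReturn` are hypothetical names introduced only for this illustration; they are not part of the benchmark. Under the standard's rules a named local returned this way is either elided (NRVO) or moved, so no copy should be counted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: wraps a vector and counts how often it is
// copy-constructed or move-constructed.
struct CopyCounter {
    static int copies;
    static int moves;
    std::vector<double> data;

    explicit CopyCounter(std::size_t n) : data(n) {}
    CopyCounter(const CopyCounter& other) : data(other.data) { ++copies; }
    CopyCounter(CopyCounter&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int CopyCounter::copies = 0;
int CopyCounter::moves = 0;

// Returns a named local by value: an NRVO candidate. If the compiler
// does not elide, the return falls back to the move constructor.
CopyCounter MakeByValue(std::size_t n) {
    CopyCounter result(n);
    return result;
}

struct ReturnStats {
    int copies;
    int moves;
};

// Resets the counters, receives one object by value, and reports the counts.
ReturnStats MeasureReturn(std::size_t n) {
    CopyCounter::copies = 0;
    CopyCounter::moves = 0;
    CopyCounter received = MakeByValue(n);
    assert(received.data.size() == n);  // the returned object is intact
    return ReturnStats{CopyCounter::copies, CopyCounter::moves};
}
```

Calling `MeasureReturn(10000)` from a `main` and printing both fields shows what a given compiler actually did. The copy count should be zero; the move count depends on whether elision was applied. This only shows that the return itself does not copy the elements; it says nothing about the cost of building the vector in the first place.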

I am using clang 5.0.2.

My compile line is: clang++ -std=c++17 RVO_PassByReference.cpp -o RVO_PassByReference -O3 -march=native

#include <array>
#include <vector>
#include <chrono>
#include <iostream>

using namespace std;
using namespace std::chrono;

vector<double> RVOMultiply(const vector<double>& v1, const vector<double>& v2)
{
    std::vector<double> ResultVector;
    ResultVector.reserve(v1.size());
    for (size_t i {0}; i < v1.size(); ++i)
    {
        ResultVector.emplace_back(v1[i] * v2[i]);
    }
    return ResultVector;
}

void PassByReferenceMultiply(const vector<double>& v1, const vector<double>& v2, vector<double>& Result)
{
    for (size_t i {0}; i < Result.size(); ++i)
    {
        Result[i] = v1[i] * v2[i];
    }
}

int main ()
{

    vector<double> ReferenceVector(10000);
    vector<double> Operand1Vector(10000);
    vector<double> Operand2Vector(10000);

    for (size_t i {0}; i < Operand1Vector.size(); ++i)
    {
        Operand1Vector[i] = i;
        Operand2Vector[i] = i+1;
    }

    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto duration1 = duration_cast<nanoseconds>(t2 - t1).count();
    auto duration2 = duration_cast<nanoseconds>(t2 - t1).count();


    for (double z {0}; z < 100000; ++z)
    {
        t1 = high_resolution_clock::now();
        vector<double> RVOVector = RVOMultiply(Operand1Vector, Operand2Vector);
        t2 = high_resolution_clock::now();
        if (z != 99999)
            vector<double>().swap(RVOVector);

        duration1 += duration_cast<nanoseconds>(t2 - t1).count();


        t1 = high_resolution_clock::now();
        PassByReferenceMultiply(Operand1Vector, Operand2Vector, ReferenceVector);
        t2 = high_resolution_clock::now();
        duration2 += duration_cast<nanoseconds>(t2 - t1).count();

    }

    duration1 /= 100000;
    duration2 /= 100000;

    cout << "RVOVector Duration Average was: " << duration1 << endl;
    cout << "ReferenceVector push_back Duration Average was: " << duration2 << endl;

}

The output on my system (average nanoseconds per call) is:

RVOVector Duration Average was: 11901
ReferenceVector push_back Duration Average was: 3634
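One difference between the two timed regions can be read directly off the benchmark: RVOMultiply calls reserve on a fresh vector inside the timed region on every iteration, while PassByReferenceMultiply writes into ReferenceVector, which was allocated once before the loop. The sketch below makes that visible by counting allocations. `CountingAllocator`, `CountedVector`, `MultiplyByValue`, `MultiplyInto`, and `CountAllocations` are hypothetical names for this illustration only. It measures no time, so it does not establish how much of the gap is due to allocation rather than, for example, the per-element capacity check in emplace_back.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Global counter incremented by every CountingAllocator::allocate call.
static std::size_t g_allocations = 0;

// Hypothetical minimal allocator that forwards to std::allocator and
// counts how many times memory is requested.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedVector = std::vector<double, CountingAllocator<double>>;

// Same shape as RVOMultiply: builds and returns a new vector.
CountedVector MultiplyByValue(const CountedVector& v1, const CountedVector& v2) {
    assert(v1.size() == v2.size());
    CountedVector result;
    result.reserve(v1.size());
    for (std::size_t i = 0; i < v1.size(); ++i) {
        result.emplace_back(v1[i] * v2[i]);
    }
    return result;
}

// Same shape as PassByReferenceMultiply: writes into an existing vector.
void MultiplyInto(const CountedVector& v1, const CountedVector& v2, CountedVector& result) {
    assert(v1.size() >= result.size() && v2.size() >= result.size());
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = v1[i] * v2[i];
    }
}

// Runs one of the two variants `calls` times and returns how many
// allocations happened during those calls (setup is excluded).
std::size_t CountAllocations(bool by_value, int calls) {
    CountedVector a(1000, 2.0), b(1000, 3.0), out(1000);
    g_allocations = 0;
    for (int k = 0; k < calls; ++k) {
        if (by_value) {
            CountedVector r = MultiplyByValue(a, b);
        } else {
            MultiplyInto(a, b, out);
        }
    }
    return g_allocations;
}
```

`CountAllocations(true, n)` reports at least one allocation per call, while `CountAllocations(false, n)` reports none, because the output buffer is reused. A like-for-like timing comparison would need to either include the allocation of ReferenceVector in the pass-by-reference measurement or exclude it from both.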

0 answers:

No answers