
Question

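So I've been reading this question and playing with its code. It has been a few years since it was originally posted, and I was curious how newer compilers handle it. However, what I'm finding with g++ 4.9.1 leaves me very confused.

I have this code: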

#include <cstdlib>
#include <vector>

#include <iostream>
#include <string>

#include <boost/date_time/posix_time/ptime.hpp>
#include <boost/date_time/microsec_time_clock.hpp>

constexpr int outerLoopBound = 1000;

class TestTimer {
    public:
        TestTimer(const std::string & name) : name(name),
        start(boost::date_time::microsec_clock<boost::posix_time::ptime>::local_time()) {}

        ~TestTimer() {
            using namespace std;
            using namespace boost;

            posix_time::ptime now(date_time::microsec_clock<posix_time::ptime>::local_time());
            posix_time::time_duration d = now - start;

            cout << name << " completed in " << d.total_milliseconds() / 1000.0 <<
            " seconds" << endl;
        }

    private:
        std::string name;
        boost::posix_time::ptime start;
};

struct Pixel {
    Pixel() {}

    Pixel(unsigned char r, unsigned char g, unsigned char b) : r(r), g(g), b(b) {}

    unsigned char r, g, b;
};

double UseVector() {
    TestTimer t("UseVector");
    double sum = 0.0;

    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;

        std::vector<Pixel> pixels;
        pixels.resize(dimension * dimension);

        for(int i = 0; i < dimension * dimension; ++i) {
            pixels[i].r = 255;
            pixels[i].g = 0;
            pixels[i].b = 0;
        }
        sum += pixels[0].b;
    }
    return sum;
}

double UseVector2() {
    TestTimer t("UseVector2");
    double sum = 0.0;
    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;

        std::vector<Pixel> pixels(dimension*dimension, Pixel(255,0,0));
        sum += pixels[0].b;
    }
    return sum;
}

double UseVector3() {
    TestTimer t("UseVector3");
    double sum = 0.0;
    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;
        Pixel p(255, 0, 0);

        std::vector<Pixel> pixels(dimension*dimension, p);
        sum += pixels[0].b;
    }
    return sum;
}

double UseVectorPushBack() {
    TestTimer t("UseVectorPushBack");

    double sum = 0.0;
    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;

        std::vector<Pixel> pixels;
        pixels.reserve(dimension * dimension);

        for(int i = 0; i < dimension * dimension; ++i)
            pixels.push_back(Pixel(255, 0, 0));

        sum += pixels[0].b;
    }
    return sum;
}

void UseVectorEmplaceBack() {
    TestTimer t("UseVectorPushBack");

    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;

        std::vector<Pixel> pixels;
        pixels.reserve(dimension * dimension);

        for(int i = 0; i < dimension * dimension; ++i)
            pixels.emplace_back(Pixel(255, 0, 0));
    }
}

double UseArray() {
    TestTimer t("UseArray");

    double sum = 0.0;
    for(int i = 0; i < outerLoopBound; ++i) {
        int dimension = 999;

        Pixel * pixels = (Pixel *)malloc(sizeof(Pixel) * dimension * dimension);

        for(int i = 0 ; i < dimension * dimension; ++i) {
            pixels[i].r = 255;
            pixels[i].g = 0;
            pixels[i].b = 0;
        }

        sum += pixels[0].b;
        free(pixels);
    }
    return sum;
}

int main()
{
    TestTimer t1("The whole thing");

    double result = 0.0;
    result += UseArray();
    result += UseVector();
    result += UseVector2();
    result += UseVector3();
    result += UseVectorPushBack();
    std::cout << "Result is: " << result << '\n';

    return 0;
}

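I basically modified the original code a bit, hoping to keep the compiler from optimizing everything away. So we have:

UseVector: creates an empty vector, uses resize, then loops and sets every Pixel.
UseVector2: creates the vector directly at the required size and initializes it from a temporary Pixel.
UseVector3: creates the vector directly at the required size and initializes it from a single lvalue Pixel.
UseVectorPushBack: creates an empty vector, uses reserve, then adds Pixels with push_back.
UseVectorEmplaceBack: creates an empty vector, uses reserve, then adds Pixels with emplace_back.
UseArray: mallocs an array, loops setting all the Pixels, then deallocates it.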

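In addition, all of these functions accumulate a value in a sum variable and return it, to prevent the compiler from eliminating the loops. In main, I test all of these functions except UseVectorEmplaceBack. This matters later.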

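So I compile with the following flags: g++ -O3 -march=native -std=c++11 main.cpp. I'm on an FX-8350.

Round 1

As is, the code produces for me:

UseArray completed in 0.248 seconds
UseVector completed in 0.245 seconds
UseVector2 completed in 0.872 seconds
UseVector3 completed in 0.981 seconds
UseVectorPushBack completed in 4.779 seconds
Result is: 0
The whole thing completed in 7.126 seconds

The first thing I notice is that UseVector, which was slower in the original question, is now as fast as the C array, even though in theory it should be making two passes over the data.

On the other hand, UseVector2 and UseVector3 are roughly 3.5 to 4 times slower than UseVector. This seems strange to me; why would this happen?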

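Round 2

OK, so we have a UseVectorEmplaceBack function, but we're not actually testing it. Why not comment it out? So we comment it out and try the code again:

UseArray completed in 0.246 seconds
UseVector completed in 0.245 seconds
UseVector2 completed in 0.984 seconds
UseVector3 completed in 0.8 seconds
UseVectorPushBack completed in 4.771 seconds
Result is: 0
The whole thing completed in 7.047 seconds

OK, so apparently, while UseVector2 was slightly faster than UseVector3 before, now the situation has reversed. This result persists even after running the code multiple times. So we changed the running time of two functions by commenting out an unused function. Wait, what?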

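Round 3

Continuing from here, at this point we consider UseVector3 to be, for some reason, the faster one. We want to make it even faster, so we comment out the following line in UseVector3 to reduce its workload:

// sum += pixels[0].b;

Thanks to our incredible coding abilities, the function will now be even faster! Let's test it:

UseArray completed in 0.245 seconds
UseVector completed in 0.244 seconds
UseVector2 completed in 0.81 seconds
UseVector3 completed in 0.867 seconds
UseVectorPushBack completed in 4.778 seconds
Result is: 0
The whole thing completed in 6.946 seconds

OK, so we removed an operation from UseVector3, which slowed down, while UseVector2, untouched, became the faster of the two.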

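Conclusions

Beyond this, there are a number of other weird behaviors, too many to mention. It seems that every random edit to this code produces strange effects. For now I'm only curious about the three things I've shown here:

Why are both UseVector2 and UseVector3 slower than UseVector?
Why does commenting out an unused function change the timing of two other functions, but not the rest?
Why does removing an operation from a function slow it down while speeding up another one?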

Weird behavior of std::vector, can you explain it?
