Same function, different performance results with Google Benchmark

Time: 2020-12-23 18:25:46

Tags: c++ google-benchmark

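I am trying to get familiar with the Google Benchmark framework and decided to run a test with the well-known pre/post increment comparison. However, I found that when executing the same function, literally the same code, I get different timing results.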

My test consists of three functions:

  • incrementA, a plain for loop with nothing special
  • incrementB, a copy of incrementA
  • increment, which calls incrementA

With these three functions, I wrote a fixture and then registered the tests.

#include <assert.h>
#include <stdint.h>

#include <benchmark/benchmark.h>

//---------------------------------------------------------------------

void incrementA(int COUNT) {
    volatile int a[COUNT+1];
    int i = 0;
    for (int j = 0; j < 1000; j++) {
        i = 0;
        for (int k = 0; k < COUNT; k++) {
            a[i++] = k + j;
        }
    }
}

void incrementB(int COUNT) {
    volatile int a[COUNT+1];
    int i = 0;
    for (int j = 0; j < 1000; j++) {
        i = 0;
        for (int k = 0; k < COUNT; k++) {
            a[i++] = k + j;
        }
    }
}

void increment(int COUNT) {
    incrementA(COUNT);
}

//---------------------------------------------------------------------

class PrePostIncrement : public ::benchmark::Fixture
{
public:
    void SetUp(const ::benchmark::State& st)
    {
        size = st.range(0);
    }

    void TearDown(const ::benchmark::State&)
    {
    }

    static void CustomArguments(benchmark::internal::Benchmark* b)
    {
        size_t minSize = 8;
        for (int i = 0; (1 << (i + minSize)) < (1 << 20); ++i)
            b->Arg(1 << (i + minSize));
    }
    int size;
};


//---------------------------------------------------------------------


#define REGISTER_TEST(IncrementFunction)                                                \
    using IncrementFunction##_Test = PrePostIncrement;                                  \
    BENCHMARK_DEFINE_F(IncrementFunction##_Test, Obj)(benchmark::State& state)          \
    {                                                                                   \
        while (state.KeepRunning())                                                     \
        {                                                                               \
            IncrementFunction(size);                                                    \
        }                                                                               \
    }                                                                                   \
    BENCHMARK_REGISTER_F(IncrementFunction##_Test, Obj)->Apply(IncrementFunction##_Test::CustomArguments)->Unit(benchmark::kMillisecond);


REGISTER_TEST(incrementA);
REGISTER_TEST(incrementB);
REGISTER_TEST(increment);

BENCHMARK_MAIN();

Compiled with:

$ g++ increment_benchmark.cpp -std=gnu++14 -march=native -pthread -O3 -I/home/user/software/benchmark/include -L/home/user/software/benchmark/build/src -Wl,-rpath=/home/user/software/benchmark/build/src -lbenchmark

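The results are inconsistent; for example, swapping the order of the tests gives me different results. In the run below, incrementB is roughly 1.5x faster than incrementA and increment, even though its body is identical to incrementA.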

---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
incrementA_Test/Obj/256         0.125 ms        0.125 ms         5499
incrementA_Test/Obj/512         0.244 ms        0.244 ms         2868
incrementA_Test/Obj/1024        0.482 ms        0.482 ms         1439
incrementA_Test/Obj/2048        0.971 ms        0.971 ms          715
incrementA_Test/Obj/4096         1.91 ms         1.91 ms          361
incrementA_Test/Obj/8192         3.82 ms         3.82 ms          180
incrementA_Test/Obj/16384        7.77 ms         7.77 ms           90
incrementA_Test/Obj/32768        15.6 ms         15.6 ms           45
incrementA_Test/Obj/65536        30.5 ms         30.5 ms           23
incrementA_Test/Obj/131072       61.7 ms         61.7 ms           11
incrementA_Test/Obj/262144        122 ms          122 ms            6
incrementA_Test/Obj/524288        245 ms          245 ms            3
incrementB_Test/Obj/256         0.084 ms        0.084 ms         8246
incrementB_Test/Obj/512         0.166 ms        0.166 ms         4212
incrementB_Test/Obj/1024        0.321 ms        0.321 ms         2175
incrementB_Test/Obj/2048        0.629 ms        0.629 ms         1109
incrementB_Test/Obj/4096         1.23 ms         1.23 ms          564
incrementB_Test/Obj/8192         2.42 ms         2.42 ms          288
incrementB_Test/Obj/16384        4.84 ms         4.84 ms          142
incrementB_Test/Obj/32768        9.63 ms         9.63 ms           72
incrementB_Test/Obj/65536        20.3 ms         20.3 ms           34
incrementB_Test/Obj/131072       40.8 ms         40.8 ms           17
incrementB_Test/Obj/262144       81.7 ms         81.7 ms            8
incrementB_Test/Obj/524288        164 ms          164 ms            4
increment_Test/Obj/256          0.126 ms        0.126 ms         5551
increment_Test/Obj/512          0.244 ms        0.244 ms         2861
increment_Test/Obj/1024         0.482 ms        0.482 ms         1453
increment_Test/Obj/2048         0.958 ms        0.958 ms          721
increment_Test/Obj/4096          1.91 ms         1.91 ms          364
increment_Test/Obj/8192          3.82 ms         3.82 ms          183
increment_Test/Obj/16384         7.63 ms         7.63 ms           91
increment_Test/Obj/32768         15.2 ms         15.2 ms           46
increment_Test/Obj/65536         30.5 ms         30.5 ms           23
increment_Test/Obj/131072        61.0 ms         61.0 ms           11
increment_Test/Obj/262144         122 ms          122 ms            6
increment_Test/Obj/524288         244 ms          244 ms            3

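At first I thought the CPU frequency scaling governor (powersave) might be affecting the results, but after changing it to performance the results are the same.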

For reference, I built the Google Benchmark framework myself (bf585a2 [v1.5.2]), and my library versions are:

$ ldd --version
ldd (Ubuntu GLIBC 2.27-3ubuntu1.2) 2.27
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
$ g++ --version
g++ (Ubuntu 9.2.1-17ubuntu1~18.04.1) 9.2.1 20191102
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

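I am sure there are other ways to write the same test, and any suggestions are welcome, but my main interest is to understand what is wrong with my code and why I am getting different results.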

0 answers:

No answers