Question

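I am currently benchmarking various implementations of a large loop that performs an arbitrary job, and I found that my version using boost transform iterators and boost counting_iterators is very slow.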

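I wrote a small program that benchmarks two loops. Each one sums the products of all integers between 0 and SIZE-1 with an arbitrary integer (I chose 1 in my example to avoid overflow).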

Here is my code:

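//STL
#include <iostream>
#include <algorithm>
#include <functional>
#include <chrono>
#include <numeric>  // std::accumulate
#include <limits>   // std::numeric_limits
#include <string>   // std::stoi
#include <cstdlib>  // EXIT_SUCCESS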

//Boost
#include <boost/iterator/transform_iterator.hpp>
#include <boost/iterator/counting_iterator.hpp>

//Compile using
// g++ ./main.cpp -o test -std=c++11

//Launch using
// ./test 1

#define NRUN 10
#define SIZE (128*1024*1024)

struct MultiplyByN
{
    MultiplyByN( size_t N ): m_N(N){};
    size_t operator()(int i) const { return i*m_N; }
    const size_t m_N;
};

int main(int argc, char* argv[] )
{
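    //Require the multiplier on the command line
    if( argc < 2 )
    {
        std::cerr << "Usage: " << argv[0] << " N" << std::endl;
        return EXIT_FAILURE;
    }
    int N = std::stoi( argv[1] );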
    size_t sum = 0;
    //Initialize chrono helpers
    auto start = std::chrono::steady_clock::now();
    auto stop = std::chrono::steady_clock::now();
    auto diff = stop - start;
    double msec=std::numeric_limits<double>::max(); //Set min runtime to ridiculously high value
    MultiplyByN op(N);


    //Perform multiple runs in order to get the minimal runtime
    for(int k = 0; k< NRUN; k++)
    {
        sum = 0;
        start = std::chrono::steady_clock::now();
        for(int i=0;i<SIZE;i++)
        {
            sum += op(i);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        //Compute minimum runtime
        msec = std::min( msec, std::chrono::duration<double, std::milli>(diff).count() );
    }
    std::cout << "First version : Sum of values is "<< sum << std::endl;
    std::cout << "First version : Minimal Runtime was "<< msec << " msec "<< std::endl;
    msec=std::numeric_limits<double>::max(); //Reset min runtime to ridiculously high value

    //Perform multiple runs in order to get the minimal runtime
    for(int k = 0; k< NRUN; k++)
    {
        start = std::chrono::steady_clock::now();

        //Functional way to express the summation
        sum = std::accumulate(  boost::make_transform_iterator(boost::make_counting_iterator(0), op ),
                        boost::make_transform_iterator(boost::make_counting_iterator(SIZE), op ),
                        (size_t)0, std::plus<size_t>() );

        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        //Compute minimum runtime
        msec = std::min( msec, std::chrono::duration<double, std::milli>(diff).count() );
    }
    std::cout << "Second version : Sum of values is "<< sum << std::endl;
    std::cout << "Second version version : Minimal Runtime was "<< msec << " msec "<< std::endl;
    return EXIT_SUCCESS;
}

The output I get:

./test 1
First version : Sum of values is 9007199187632128
First version : Minimal Runtime was 433.142 msec 
Second version : Sum of values is 9007199187632128
Second version version : Minimal Runtime was 10910.7 msec
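
As a sanity check, the sum both loops report can be compared against the closed form N * SIZE * (SIZE - 1) / 2. The sketch below is not part of the original program; `expected_sum` is a hypothetical helper name, and it assumes the result fits in 64 bits, as it does for N = 1.

```cpp
#include <cstdint>

// Closed form of sum_{i=0}^{size-1} i*n, which equals n * size * (size - 1) / 2.
// Hypothetical helper for illustration; it is not in the benchmarked program.
constexpr std::uint64_t expected_sum(std::uint64_t size, std::uint64_t n)
{
    // Exactly one of size and size-1 is even, so halve that factor first
    // to keep the intermediate product as small as possible.
    return n * ((size % 2 == 0) ? (size / 2) * (size - 1)
                                : size * ((size - 1) / 2));
}

// For SIZE = 128*1024*1024 and N = 1 this is 2^53 - 2^26,
// the value printed by both versions above.
static_assert(expected_sum(128ULL * 1024 * 1024, 1) == 9007199187632128ULL,
              "closed form matches the reported sum");
```

Both versions print this value, so they compute the same result and differ only in runtime.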

The "functional" version of my loop, using std::accumulate, is 25 times slower than the plain loop version. Why is that?

Thanks in advance for your help.

Answer 1

According to the comment in your code, you compiled with

g++ ./main.cpp -o test -std=c++11

Since you did not specify an optimization level, g++ uses its default, which is -O0, i.e. no optimization.

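This means the compiler did not inline anything. Template libraries such as the standard library or Boost rely on inlining for their performance. In addition, the compiler generates a lot of extra code that is far from optimal, so comparing the performance of such binaries is meaningless.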

Recompile with optimization enabled, then run the test again to get meaningful results.

Performance issue with boost transform_iterator and counting_iterator
