Question

Suppose I have a vector of strings and I want to concatenate them via std::accumulate.

If I use the following code:

std::vector<std::string> foo{"foo","bar"};
std::string res="";
res=std::accumulate(foo.begin(),foo.end(),res,
  [](std::string &rs,std::string &arg){ return rs+arg; });

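I can be pretty sure that temporary objects will be constructed.

In this answer, they say that the effect of std::accumulate is specified this way:

Computes its result by initializing the accumulator acc with the initial value init and then modifies it with acc = acc + *i or acc = binary_op(acc, *i) for every iterator i in the range [first, last) in order.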
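
To make the quoted wording concrete, here is a minimal sketch of what it describes. The names my_accumulate and concat_naive are hypothetical and exist only for illustration; a real standard library implementation may differ in detail (and since C++20 it passes std::move(acc) to the operation).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for std::accumulate, following the quoted wording.
// Written out only to show where the accumulator is passed and assigned.
template <class InputIt, class T, class BinaryOp>
T my_accumulate(InputIt first, InputIt last, T init, BinaryOp op)
{
    // init is taken by value, so the caller's object is copied on entry;
    // the local copy then serves as the accumulator "acc".
    for (; first != last; ++first)
        init = op(init, *first);   // acc = binary_op(acc, *i)
    return init;
}

// Concatenation with a lambda that returns a + b: every step builds a new
// temporary string and assigns it back into the accumulator.
inline std::string concat_naive(const std::vector<std::string>& v)
{
    return my_accumulate(v.begin(), v.end(), std::string{},
        [](const std::string& a, const std::string& b) { return a + b; });
}
```

The two places to watch are the by-value init parameter and the assignment inside the loop; these are the spots the question worries about.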

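So I'm wondering what the right way to do this is, in order to avoid unnecessary temporary object construction.

One idea was to change the lambda this way:

[](std::string &rs,std::string &arg){ rs+=arg; return rs; }

In this case, I think I force efficient concatenation of the strings and help the compiler (I know I shouldn't) elide the unnecessary copy, since this should be equivalent to (pseudocode):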

accum = [](& accum,& arg){ ...; return accum; }

and hence

accum = & accum;

Another idea was to use

accum = [](& accum,& arg){ ...; return std::move(accum); }

but this would probably result in something like:

accum = std::move(& accum);

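which looks very suspicious to me.

What is the correct way to write this to minimize the risk of unnecessary temporary object creation? I'm not only interested in std::string; I'd be happy to have a solution that would work for any object that has copy and move constructors/assignment.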

Answer 1

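I would break this into two operations: first std::accumulate to obtain the total length of the string that needs to be created, then std::for_each with a lambda that updates a local string: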

std::string::size_type total = std::accumulate(foo.begin(), foo.end(),
                std::string::size_type{0},   // accumulate in size_type, not unsigned int
                [](std::string::size_type c, std::string const& s) {
                    return c + s.size();
                });
std::string result;
result.reserve(total);
std::for_each(foo.begin(), foo.end(), 
              [&](std::string const& s) { result += s; });

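A common alternative to this is using expression templates, but that does not fit in an answer. Basically you create a data structure that maps the operations but does not execute them. When the expression is finally evaluated, it can gather the information it needs up front and use that to reserve the space and do the copies. The code that uses the expression template is nicer, but more complicated.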
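
As a rough illustration of the expression-template idea (not code from this answer), the sketch below records a chain of + operations as nested types and only touches the strings when the expression is evaluated. The namespace lazy and the names Expr, Leaf, Concat and evaluate are all hypothetical, and C++17 is assumed.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

namespace lazy {   // hypothetical namespace for this sketch

struct Expr {};    // tag base so operator+ below only matches our node types

// Leaf node: refers to an existing string without copying it.
struct Leaf : Expr {
    const std::string& s;
    explicit Leaf(const std::string& str) : s(str) {}
    std::size_t size() const { return s.size(); }
    void append_to(std::string& out) const { out += s; }
};

// Interior node: records "lhs + rhs" but performs no concatenation yet.
template <class L, class R>
struct Concat : Expr {
    L lhs;
    R rhs;
    Concat(L l, R r) : lhs(l), rhs(r) {}
    std::size_t size() const { return lhs.size() + rhs.size(); }
    void append_to(std::string& out) const
    {
        lhs.append_to(out);
        rhs.append_to(out);
    }
};

// Building the expression only nests node types; no string data is copied.
template <class L, class R,
          class = std::enable_if_t<std::is_base_of_v<Expr, L> &&
                                   std::is_base_of_v<Expr, R>>>
Concat<L, R> operator+(L lhs, R rhs)
{
    return Concat<L, R>(lhs, rhs);
}

// Evaluation: gather the total size first, reserve once, then append.
template <class E>
std::string evaluate(const E& e)
{
    std::string out;
    out.reserve(e.size());
    e.append_to(out);
    return out;
}

} // namespace lazy
```

With strings a, b and c in scope, lazy::evaluate(lazy::Leaf(a) + lazy::Leaf(b) + lazy::Leaf(c)) would build the result with a single reserve. This form suits a fixed number of operands known at compile time; for a vector whose size is only known at run time, the two-pass code above applies the same "measure first, copy second" idea.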

Answer 2

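Try the following

// Note: relies on accumulate passing acc as an lvalue; since C++20 it passes
// std::move(acc), which cannot bind to the non-const reference rs.
res=std::accumulate(foo.begin(),foo.end(),res,
  [](std::string &rs, const std::string &arg) -> std::string & { return rs+=arg; });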

Before this call, it may make sense to call

std::string::size_type n = std::accumulate( foo.begin(), foo.end(), 
   std::string::size_type( 0 ),
   [] ( std::string::size_type n, const std::string &s ) { return ( n += s.size() ); } );

res.reserve( n );

Answer 3

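Using std::accumulate efficiently without any redundant copies is not obvious. In addition to being reassigned and passed into and out of the lambda, the accumulating value may get copied internally by the implementation.
Also note that std::accumulate() itself takes the initial value by value, calling a copy ctor and thus ignoring any reserve() done on the source of the copy (as suggested in some of the other answers).

The most efficient way I found to concatenate the strings is as follows: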

std::vector<std::string> str_vec{"foo","bar"};

// get reserve size:
auto sz = std::accumulate(str_vec.cbegin(), str_vec.cend(), std::string::size_type(0),
   [](std::string::size_type n, auto const& str) { return n + str.size() + 1; });

std::string res;
res.reserve(sz);
std::accumulate(str_vec.cbegin(), str_vec.cend(),
   std::ref(res), // use a ref wrapper to keep same object with capacity
   [](std::string& a, std::string const& b) -> std::string& // must specify return type because cannot return `std::reference_wrapper<std::string>`.
{                                                           // can't use `auto&` args for the same reason
   a += b;
   return a;
});

The result will be in res. This implementation has no redundant copies, moves or reallocations.

Answer 4

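This is a bit tricky, since there are two operations involved, the addition and the assignment. In order to avoid the copies, you have to both modify the string in the addition, and ensure that the assignment is a no-op. It's the second part which is tricky.

What I've done on occasions is to create a custom "accumulator", along the lines of: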

class Accu
{
    std::string myCollector;
    enum DummyToSuppressAsgn { dummy };
public:
    Accu( std::string const& startingValue = std::string() )
        : myCollector( startingValue )
    {
    }
    //  Default copy ctor and copy asgn are OK.
    //  On the other hand, we need the following special operators
    Accu& operator=( DummyToSuppressAsgn )
    {
        //  Don't do anything...
        return *this;
    }
    DummyToSuppressAsgn operator+( std::string const& other )
    {
        myCollector += other;
        return dummy;
    }
    //  And to get the final results...
    operator std::string() const
    {
        return myCollector;
    }
};

There will be some copies when calling accumulate, and of the return value, but during the actual accumulation, nothing. Just invoke:

std::string results = std::accumulate( foo.begin(), foo.end(), Accu() );

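(If you're really concerned about performance, you can add a capacity argument to the constructor of Accu, so that it can do a reserve on the member string. If I did this, I'd probably hand-write the copy constructor as well, to ensure that the string in the copied object had the required capacity.)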
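
The capacity variant described in the parenthetical could look roughly like the sketch below. The class name ReservingAccu, the myCapacity member and the helper concat are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical variant of Accu that reserves space up front.
class ReservingAccu
{
    std::string myCollector;
    std::size_t myCapacity;
    enum DummyToSuppressAsgn { dummy };
public:
    explicit ReservingAccu( std::size_t capacity = 0 )
        : myCapacity( capacity )
    {
        myCollector.reserve( myCapacity );
    }
    //  Hand-written copy ctor: copying a std::string need not preserve
    //  its capacity, so reserve first and then append the contents.
    ReservingAccu( ReservingAccu const& other )
        : myCapacity( other.myCapacity )
    {
        myCollector.reserve( myCapacity );
        myCollector += other.myCollector;
    }
    ReservingAccu& operator=( ReservingAccu const& ) = default;
    //  Assignment from the dummy is a no-op, as in Accu.
    ReservingAccu& operator=( DummyToSuppressAsgn )
    {
        return *this;
    }
    DummyToSuppressAsgn operator+( std::string const& other )
    {
        myCollector += other;   //  appends in place, within the reserved space
        return dummy;
    }
    operator std::string() const
    {
        return myCollector;
    }
};

//  Hypothetical helper: measure first, then accumulate into reserved space.
inline std::string concat( std::vector<std::string> const& v )
{
    std::size_t total = 0;
    for ( std::string const& s : v )
        total += s.size();
    return std::accumulate( v.begin(), v.end(), ReservingAccu( total ) );
}
```

The accumulator is still copied when it is passed to and returned from accumulate, but each copy reserves the full size again, so the appends in the loop should not need to reallocate.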

Efficient accumulate
