Question

Consider this program, which I am compiling on Cygwin using gcc 5.4.0 with the command line g++ -std=c++14 -Wall -pedantic -O2 timing.cpp -o timing.

#include <chrono>
#include <iostream>
#include <string>
#include <vector>

std::string generateitem()
{
    return "a";
}

int main()
{
    std::vector<std::string> items;

    std::chrono::steady_clock clk;
    auto start(clk.now());

    std::string item;
    for (int i = 0; i < 3000000; ++i)
    {
        item = generateitem();
        items.push_back(item); // *********
    }

    auto stop(clk.now());
    std::cout
        << std::chrono::duration_cast<std::chrono::milliseconds>
            (stop-start).count()
        << " ms\n";
}

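The reported time is consistently around 500 ms. However, if I comment out the starred line, thereby omitting the push_back to the vector, the reported time is around 700 ms.

Why does not pushing to the vector make the loop run slower?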

Answer 1

I'm running tests now, and the issue is that in the push_back version the item strings are not deallocated inside the timed region. Changing the code to:

#include <chrono>
#include <iostream>
#include <string>
#include <vector>

std::string generateitem()
{
    return "a";
}

int main()
{
    std::chrono::steady_clock clk;
    auto start(clk.now());
    {
        // The extra scope destroys the vector (and every string in it)
        // before the clock is stopped.
        std::vector<std::string> items;
        std::string item;
        for (int i = 0; i < 3000000; ++i)
        {
            item = generateitem();
            items.push_back(item); // *********
        }
    }
    auto stop(clk.now());
    std::cout
        << std::chrono::duration_cast<std::chrono::milliseconds>
            (stop-start).count()
        << " ms\n";
}

gives the expected behaviour on my Cygwin machine: approximately equal times for both options, because this time we measure all the deallocations.

To explain further, the original code is basically:

allocate items
start clock
repeat 3000000 times
    allocate std::string("a")
    move std::string("a") to end of items array
stop clock
deallocate 3000000 strings

So the performance is dominated by the 3000000 allocations. Now, if we comment out the push_back(), we get:

allocate items
start clock
repeat 3000000 times
    allocate std::string("a")
    deallocate std::string("a")
stop clock

Now we measure 3000000 allocations and 3000000 deallocations, so it is clear that it will actually be slower. My suggestion of moving the items vector deallocation into the timed span means we have either, with push_back():

start clock
allocate items
repeat 3000000 times
    allocate std::string("a")
    move std::string("a") to end of items array
deallocate 3000000 strings
stop clock

or, without push_back():

start clock
allocate items
repeat 3000000 times
    allocate std::string("a")
    deallocate std::string("a")
deallocate empty array
stop clock

So either way we measure 3000000 allocations and 3000000 deallocations, and the code takes essentially the same time.
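One way to make sure both variants are timed on equal terms is to wrap the whole workload, including destruction of the vector, in a function and time the call. The following is only a sketch: time_ms and run_loop are hypothetical names that do not appear in the original post, and the keep flag adds a branch inside the loop that the original program does not have.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: runs `work` and returns the elapsed time in
// milliseconds. Anything `work` creates locally is destroyed before
// the clock is read the second time.
template <typename Work>
long long time_ms(Work work)
{
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

std::string generateitem()
{
    return "a";
}

// Runs the loop from the question. `keep` selects whether push_back is
// called. Returns the number of stored strings so the loop has an
// observable result.
std::size_t run_loop(int count, bool keep)
{
    std::vector<std::string> items;
    std::string item;
    for (int i = 0; i < count; ++i)
    {
        item = generateitem();
        if (keep)
            items.push_back(item);
    }
    return items.size();
}   // items and all of its strings are destroyed here, inside the timed call
```

Calling time_ms([] { run_loop(3000000, true); }) and time_ms([] { run_loop(3000000, false); }) from main then measures allocation and deallocation together in both cases.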

Answer 2

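Thanks to the answer by Ken Y-N, I can now answer my own question.

The code was compiled against a version of the standard library that implements copy-on-write for std::string. That is, when a string is copied, the buffer holding the string contents is not duplicated, and both string objects use the same buffer. Copying happens only when one of the strings is written to. So the lifetime of an allocated string buffer is as follows: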

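1. It is created in the generateitem function.
2. It is returned from the generateitem function, via RVO.
3. It is assigned to item. (This is a move operation, because the std::string is a temporary.)
4. The call to push_back copies the std::string, but not the buffer. There are now two std::string objects sharing one buffer.
5. In the next iteration of the loop, the next string is moved into item. Now the only std::string object using the buffer is the one in the vector.
6. When the vector is destroyed as main finishes, the reference counts of all the buffers drop to 0, so they are deallocated.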

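Thus, no buffers are deallocated during the measured time.

If we remove the call to push_back, step 4 does not happen. In step 5 the reference count of the buffer then drops to 0, so the buffer is deallocated during the measured time. This explains why the measured time goes up.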
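The reference counting in the steps above can be mimicked with std::shared_ptr, whose use_count makes the sharing visible. This is only an illustration: SharedBuffer and trace_first_buffer are made-up names, and libstdc++'s copy-on-write std::string is not implemented in terms of shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a copy-on-write string (an assumption for illustration):
// copying the handle shares the buffer, and the buffer is freed only
// when the last handle goes away.
using SharedBuffer = std::shared_ptr<const std::string>;

SharedBuffer generateitem()
{
    return std::make_shared<const std::string>("a");  // step 1: buffer allocated
}

// Records the reference count of the first buffer after steps 3, 4 and 5.
std::vector<long> trace_first_buffer()
{
    std::vector<long> counts;
    std::vector<SharedBuffer> items;

    SharedBuffer item = generateitem();      // steps 2 and 3: one owner
    counts.push_back(item.use_count());

    items.push_back(item);                   // step 4: handle copied, buffer shared
    counts.push_back(items[0].use_count());

    item = generateitem();                   // step 5: item lets go of the first buffer
    counts.push_back(items[0].use_count());

    assert(items.size() == 1);
    return counts;
}   // step 6: items is destroyed here and the first buffer is finally freed
```

Without the push_back in step 4, the assignment in step 5 would release the last owner of the first buffer, so the deallocation would happen inside the loop.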

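Now, according to the documentation, GCC 5 should have replaced the copy-on-write string class with a new version that does not use copy-on-write. Apparently, however, the old version is still used by default on Cygwin. If we add -D_GLIBCXX_USE_CXX11_ABI=1 to the command line, we get the new string class, and we get the results we would expect.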
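To check which string implementation a given build actually uses, the value of the _GLIBCXX_USE_CXX11_ABI macro can be inspected after including a libstdc++ header. A small sketch follows. describe_string_abi is a hypothetical helper name, and the returned wording is mine.

```cpp
#include <string>   // pulls in the libstdc++ configuration that defines the macro

// Hypothetical helper: reports which libstdc++ std::string ABI this
// translation unit was compiled against.
std::string describe_string_abi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
    return "libstdc++ with the new (non-copy-on-write) std::string";
#  else
    return "libstdc++ with the old (copy-on-write) std::string";
#  endif
#else
    return "_GLIBCXX_USE_CXX11_ABI is not defined (likely not libstdc++ 5 or later)";
#endif
}
```

Printing the result from main once with and once without -D_GLIBCXX_USE_CXX11_ABI=1 should show whether the flag took effect for that toolchain.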
