Question

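I'm currently learning C++, and I'm curious about how push_back() and emplace_back() work under the hood. I've always assumed that emplace_back() is faster when you construct a large object and add it to the back of a container such as a vector.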

Suppose I have a Student object that I want to append to the back of a vector of Students.

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           /* initialize member variables */ { }
};

Suppose I call push_back() and push a Student object onto the end of the vector:

vector<Student> vec;
vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));

My understanding is that push_back creates an instance of the Student object outside the vector and then moves it to the back of the vector.

[Diagram]

I could also emplace instead of push:

vector<Student> vec;
vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);

My understanding is that the Student object is constructed directly at the back of the vector, so no move is required.

[Diagram]

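So it would make sense for emplace_back to be faster, especially when many Student objects are added. However, when I timed these two versions of the code: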

for (int i = 0; i < 10000000; ++i) {
    vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));
}

and

for (int i = 0; i < 10000000; ++i) {
    vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);
}

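I expected the latter to be faster, since the large Student object does not have to be moved. Oddly, the emplace_back version ended up slower (over multiple attempts). I also tried inserting 10000000 Student objects where the constructor takes references and the arguments to push_back() and emplace_back() are stored in variables. That didn't help either: emplace_back was still slower.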

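I've checked that the same number of objects is inserted in both cases. The time difference isn't too large, but emplace_back ended up slower by a few seconds.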

What am I misunderstanding about how push_back() and emplace_back() work? Thank you very much for your time!

Here is the code, as requested. I'm using the g++ compiler.

push_back:

#include <string>
#include <vector>
using namespace std;

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           name(name_in), student_ID(ID_in), GPA(GPA_in), 
           favorite_food(food_in), favorite_prof(prof_in),
           hours_slept(sleep_in), birthyear(birthyear_in) {}
};

int main() {
    vector<Student> vec;
    vec.reserve(10000000);
    for (int i = 0; i < 10000000; ++i) 
         vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));
    return 0;
}

emplace_back:

#include <string>
#include <vector>
using namespace std;

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           name(name_in), student_ID(ID_in), GPA(GPA_in), 
           favorite_food(food_in), favorite_prof(prof_in),
           hours_slept(sleep_in), birthyear(birthyear_in) {}
};

int main() {
    vector<Student> vec;
    vec.reserve(10000000);
    for (int i = 0; i < 10000000; ++i) 
         vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);
    return 0;
}

Answer 1

This behavior is due to the complexity of std::string. A couple of things interact here:

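The Small String Optimization (SSO)
In the push_back version, the compiler is able to determine the length of the string at compile time, whereas in the emplace_back version it cannot. Thus, the emplace_back call requires calls to strlen. Furthermore, since the compiler doesn't know the length of the string literal, it has to emit code for both the SSO and the non-SSO case (see Jason Turner's "Initializer Lists Are Broken, Let's Fix Them"; it's a long talk, but he follows the problem of inserting strings into a vector throughout it).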

Consider this simple type:

struct type {
  std::string a;
  std::string b;
  std::string c;

  type(std::string a, std::string b, std::string c)
    : a{a}
    , b{b}
    , c{c}
  {}
};

Note how the constructor copies a, b, and c.

Testing this against a baseline of just allocating memory, we can see that push_back outperforms emplace_back:

[Benchmark chart: quick-bench link]

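Since the strings in the example all fit inside the SSO buffer, copying is as cheap as moving in this case. Thus, the constructor is already efficient, and the improvement from emplace_back has a smaller effect.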

Also, if we search the assembly for a call to push_back and a call to emplace_back:

// push_back call
void foo(std::vector<type>& vec) {
    vec.push_back({"Bob", "pizza", "Smith"});
}

// emplace_back call
void foo(std::vector<type>& vec) {
    vec.emplace_back("Bob", "pizza", "Smith");
}

(Assembly not copied here. It's massive. std::string is complicated.)

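We can see that emplace_back has calls to strlen, whereas push_back does not. Since the distance between the string literal and the std::string being constructed is larger, the compiler wasn't able to optimize out the call to strlen.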

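Explicitly calling the std::string constructor would remove the calls to strlen, but the strings would no longer be constructed in place, so that doesn't speed up emplace_back.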

All this said, if we leave the SSO by using long enough strings, the allocation cost completely drowns out these details, so emplace_back and push_back have the same performance:

[Benchmark chart: quick-bench link]

If you fix the constructor of type to move its arguments, emplace_back becomes faster in all cases.

struct type {
  std::string a;
  std::string b;
  std::string c;

  type(std::string a, std::string b, std::string c)
    : a{std::move(a)}
    , b{std::move(b)}
    , c{std::move(c)}
  {}
};

SSO case

[Benchmark chart: quick-bench link]

Long case

[Benchmark chart: quick-bench link]

However, the SSO push_back case slowed down; the compiler seems to emit extra copies.

The optimal version with perfect forwarding does not suffer from this drawback (note the scale change on the vertical axis):

struct type {
  std::string a;
  std::string b;
  std::string c;

  template <typename A, typename B, typename C>
  type(A&& a, B&& b, C&& c)
    : a{std::forward<A>(a)}
    , b{std::forward<B>(b)}
    , c{std::forward<C>(c)}
  {}
};

[Benchmark chart: quick-bench link]
