具有boost变体单一访问者与多访问者与动态多态性的静态多态性

时间:2016-05-13 18:45:49

标签: c++ performance templates boost polymorphism

我正在比较以下C ++多态性方法的性能:

方法[1]。静态多态性使用boost变体,每个方法都有一个单独的访问者 方法[2]。静态多态性使用boost变体与单个访问者使用方法重载调用不同的方法 方法[3]。普通的旧动态多态性

平台: - Intel x86 64位Red Hat现代多核处理器,32 GB RAM - 具有-O2优化的gcc(GCC)4.8.1 - 提升1.6.0

一些调查结果:

  • 方法[1]似乎具有明显优于方法[2]和[3]
  • 方法[3]大部分时间都优于方法[2]

我的问题是,为什么我使用访问者但使用方法重载调用正确方法的方法[2]比虚拟方法提供更差的性能。我希望静态多态性比动态多态更好。我理解在方法[2]中传递的额外参数有一些成本要计算要调用的类的visit()方法,以及由于方法重载可能会有更多的分支?但是,这不应该优于虚拟方法吗?

代码如下:

// qcpptest.hpp

#ifndef INCLUDED_QCPPTEST_H
#define INCLUDED_QCPPTEST_H

#include <boost/variant.hpp>

class IShape {
 public:
  virtual void rotate() = 0;
  virtual void spin() = 0;
};

class Square : public IShape {
 public:
  void rotate() {
   // std::cout << "Square:I am rotating" << std::endl;
    }
  void spin() { 
    // std::cout << "Square:I am spinning" << std::endl; 
  }
};

class Circle : public IShape {
 public:
  void rotate() { 
    // std::cout << "Circle:I am rotating" << std::endl; 
  }
  void spin() {
   // std::cout << "Circle:I am spinning" << std::endl; 
}
};

// template variation

// enum class M {ADD, DEL};
struct ADD {};
struct DEL {};

class TSquare {
    int i;
 public:
    void visit(const ADD& add) {
        this->i++;
    // std::cout << "TSquare:I am rotating" << std::endl;
  }
    void visit(const DEL& del) {
        this->i++;
    // std::cout << "TSquare:I am spinning" << std::endl;
  }

    void spin() {
        this->i++;
     // std::cout << "TSquare:I am rotating" << std::endl; 
 }
    void rotate() {
        this->i++;
     // std::cout << "TSquare:I am spinning" << std::endl; 
 }
};

class TCircle {
    int i;
 public:
    void visit(const ADD& add) {
        this->i++;
    // std::cout << "TCircle:I am rotating" << std::endl;
  }
    void visit(const DEL& del) {
        this->i++;
    // std::cout << "TCircle:I am spinning" << std::endl;
  }
    void spin() { 
        this->i++;
        // std::cout << "TSquare:I am rotating" << std::endl; 
    }
    void rotate() {
    this->i++; 
        // std::cout << "TSquare:I am spinning" << std::endl; 
    }
};

class MultiVisitor : public boost::static_visitor<void> {
 public:
  template <typename T, typename U>

    void operator()(T& t, const U& u) {
    // std::cout << "visit" << std::endl;
    t.visit(u);
  }
};

// separate visitors, single dispatch

class RotateVisitor : public boost::static_visitor<void> {
 public:
  template <class T>
  void operator()(T& x) {
    x.rotate();
  }
};

class SpinVisitor : public boost::static_visitor<void> {
 public:
  template <class T>
  void operator()(T& x) {
    x.spin();
  }
};

#endif

// qcpptest.cpp

#include <iostream>
#include "qcpptest.hpp"
#include <vector>
#include <boost/chrono.hpp>

using MV = boost::variant<ADD, DEL>;
// MV const add = M::ADD;
// MV const del = M::DEL;
static MV const add = ADD();
static MV const del = DEL();

void make_virtual_shapes(int iters) {
  // std::cout << "make_virtual_shapes" << std::endl;
  std::vector<IShape*> shapes;
  shapes.push_back(new Square());
  shapes.push_back(new Circle());

  boost::chrono::high_resolution_clock::time_point start =
      boost::chrono::high_resolution_clock::now();

  for (int i = 0; i < iters; i++) {
    for (IShape* shape : shapes) {
      shape->rotate();
      shape->spin();
    }
  }

  boost::chrono::nanoseconds nanos =
      boost::chrono::high_resolution_clock::now() - start;
  std::cout << "make_virtual_shapes took " << nanos.count() * 1e-6
            << " millis\n";
}

void make_template_shapes(int iters) {
  // std::cout << "make_template_shapes" << std::endl;
  using TShapes = boost::variant<TSquare, TCircle>;
  // using MV = boost::variant< M >;

  // xyz
  std::vector<TShapes> tshapes;
  tshapes.push_back(TSquare());
  tshapes.push_back(TCircle());
  MultiVisitor mv;

  boost::chrono::high_resolution_clock::time_point start =
      boost::chrono::high_resolution_clock::now();

  for (int i = 0; i < iters; i++) {
    for (TShapes& shape : tshapes) {
      boost::apply_visitor(mv, shape, add);
      boost::apply_visitor(mv, shape, del);
      // boost::apply_visitor(sv, shape);
    }
  }
  boost::chrono::nanoseconds nanos =
      boost::chrono::high_resolution_clock::now() - start;
  std::cout << "make_template_shapes took " << nanos.count() * 1e-6
            << " millis\n";
}

void make_template_shapes_single(int iters) {
  // std::cout << "make_template_shapes_single" << std::endl;
  using TShapes = boost::variant<TSquare, TCircle>;
  // xyz
  std::vector<TShapes> tshapes;
  tshapes.push_back(TSquare());
  tshapes.push_back(TCircle());
  SpinVisitor sv;
  RotateVisitor rv;

  boost::chrono::high_resolution_clock::time_point start =
      boost::chrono::high_resolution_clock::now();

  for (int i = 0; i < iters; i++) {
    for (TShapes& shape : tshapes) {
      boost::apply_visitor(rv, shape);
      boost::apply_visitor(sv, shape);
    }
  }
  boost::chrono::nanoseconds nanos =
      boost::chrono::high_resolution_clock::now() - start;
  std::cout << "make_template_shapes_single took " << nanos.count() * 1e-6
            << " millis\n";
}

int main(int argc, const char* argv[]) {
  std::cout << "Hello, cmake" << std::endl;

  int iters = atoi(argv[1]);

  make_virtual_shapes(iters);
  make_template_shapes(iters);
  make_template_shapes_single(iters);

  return 0;
}

1 个答案:

答案 0 :(得分:6)

方法2基本上是无效地重新实现动态调度。当你有:

shape->rotate();
shape->spin();

这涉及在vtable中查找正确的函数并调用它。查找效率低下。但是当你有:

boost::apply_visitor(mv, shape, add);

大致爆炸(假设add<>成员函数模板只是reinterpret_cast而没有检查):

if (shape.which() == 0) {
    if (add.which() == 0) {
        mv(shape.as<TSquare&>(), add.as<ADD&>());
    }
    else if (add.which() == 1) {
        mv(shape.as<TSquare&>(), add.as<DEL&>());
    }
    else {
        // ???
    }
}
else if (shape.which() == 1) {
    if (add.which() == 0) {
        mv(shape.as<TCircle&>(), add.as<ADD&>());
    }
    else if (add.which() == 1) {
        mv(shape.as<TCircle&>(), add.as<DEL&>());
    }
    else {
        // ???
    }
}
else {
   // ???
}

在这里,我们有分支的组合爆炸(我们在方法1中没有这样做)但我们实际上必须检查每个变体的每个可能的静态类型以找出我们必须做的事情(我们没有方法3中不得不这样做。而且这些分支将无法被预测,因为你每次都使用不同的分支,所以你不能在没有匆忙停止的情况下管理任何类型的代码。

mv()上的重载是免费的 - 它是在弄清楚我们正在调用的内容mv。还要注意基于更改两个轴中的任何一个而发生的增量时间:

+---------------+----------------+----------------+----------+
|               |    Method 1    |    Method 2    | Method 3 |
+---------------+----------------+----------------+----------+
|    New Type   | More Expensive | More Expensive |   Free   |
| New Operation |      Free      | More Expensive |   Free*  |
+---------------+----------------+----------------+----------+

方法1在添加新类型时变得更加昂贵,因为我们必须显式迭代所有类型。添加新操作是免费的,因为操作无关紧要。

方法3可以自由添加新类型并且可以自由添加新操作 - 唯一的变化就是增加vtable。由于对象大小,这将产生一些影响,但通常会比增加的类型迭代小。