Question

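I am trying to implement an application with boost::multi_index and the performance is very bad: inserting 10,000 objects takes about 0.1 seconds, which is not acceptable. Looking through the documentation, I found that boost::multi_index accepts a memory allocator as its last template parameter, but when I tried to supply one myself I got a lot of compile errors. Please help me correct this. Thanks.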

struct order
{
    unsigned int    id;
    unsigned int    quantity;
    double          price;
};

struct id{};
struct price{};

typedef multi_index_container<
  order,
  indexed_by<
    hashed_unique<
      tag<id>,  BOOST_MULTI_INDEX_MEMBER(order, unsigned int, id)>,
    ordered_non_unique<
      tag<price>,BOOST_MULTI_INDEX_MEMBER(order ,double, price),
        std::less<double> >
  >,
  boost::object_pool<order>
> order_sell;

In short, the compiler does not accept boost::object_pool as the allocator in the declaration of order_sell.

Answer 1

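Let me restate Alexander's suggestion that you profile your program to understand where the problem really lies. I strongly doubt that Boost.MultiIndex per se can be as slow as you say. The following program measures the time taken to create an order_sell container (without Boost.Pool), populate it with 10,000 random orders and destroy it: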

Live Coliru Demo

#include <algorithm>
#include <array>
#include <chrono>
#include <numeric> 

std::chrono::high_resolution_clock::time_point measure_start,measure_pause;

template<typename F>
double measure(F f)
{
  using namespace std::chrono;

  static const int              num_trials=10;
  static const milliseconds     min_time_per_trial(200);
  std::array<double,num_trials> trials;
  volatile decltype(f())        res; /* to avoid optimizing f() away */

  for(int i=0;i<num_trials;++i){
    int                               runs=0;
    high_resolution_clock::time_point t2;

    measure_start=high_resolution_clock::now();
    do{
      res=f();
      ++runs;
      t2=high_resolution_clock::now();
    }while(t2-measure_start<min_time_per_trial);
    trials[i]=duration_cast<duration<double>>(t2-measure_start).count()/runs;
  }
  (void)res; /* var not used warn */

  std::sort(trials.begin(),trials.end());
  return std::accumulate(
    trials.begin()+2,trials.end()-2,0.0)/(trials.size()-4);
}

void pause_timing()
{
  measure_pause=std::chrono::high_resolution_clock::now();
}

void resume_timing()
{
  measure_start+=std::chrono::high_resolution_clock::now()-measure_pause;
}

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/member.hpp>

using namespace boost::multi_index;

struct order
{
    unsigned int    id;
    unsigned int    quantity;
    double          price;
};

struct id{};
struct price{};

typedef multi_index_container<
  order,
  indexed_by<
    hashed_unique<
      tag<id>,BOOST_MULTI_INDEX_MEMBER(order, unsigned int, id)>,
    ordered_non_unique<
      tag<price>,BOOST_MULTI_INDEX_MEMBER(order ,double, price),
        std::less<double> >
  >
> order_sell; 

#include <iostream>
#include <random>

int main()
{
  std::cout<<"Insertion of 10,000 random orders plus container cleanup\n";
  std::cout<<measure([](){
    order_sell os;
    std::mt19937                                gen{34862};
    std::uniform_int_distribution<unsigned int> uidist;
    std::uniform_real_distribution<double>      dbdist;

    for(unsigned int n=0;n<10000;++n){
      os.insert(order{uidist(gen),0,dbdist(gen)});
    }
    return os.size();
  })<<" seg.\n";
}

When run in -O3 mode with whatever backend Coliru uses, we get:

Insertion of 10,000 random orders plus container cleanup
0.00494657 seg.

VS 2015 in release mode on my machine (Intel Core i5-2520M @ 2.50GHz) yields:

Insertion of 10,000 random orders plus container cleanup
0.00492825 seg.

So this is around 20 times faster than you report, and the measurement includes container destruction and random number generation.

A couple of additional observations:

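boost::object_pool does not provide the allocator interface that the standard library specifies for interoperability with containers. You may want to use boost::pool_allocator instead (I have played with it a bit and it does not seem to improve speed, but your mileage may vary).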
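John's answer seems to imply that Boost.MultiIndex is suboptimal in the sense that it allocates nodes separately from values, or something like that. In fact, the library is as efficient as it can be in terms of memory allocation, and you cannot do better with Boost.Intrusive (you can match it, but not beat it). If you are curious about what the internal data structures of Boost.MultiIndex look like, take a look at this answer of mine. In particular, for your order_sell container with a hashed index and an ordered index, each value goes into a single node of its own, and in addition there is a separate so-called bucket array (an array of pointers) whose length is roughly the same as the number of elements. You cannot do better than that for a node-based data structure (if you were willing to give up iterator stability, you could devise more memory-efficient schemes).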

Answer 2

For a few reasons, you either can't or shouldn't do this.

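First, boost::object_pool has a performance problem: freeing an object from it is O(N). If you want to do this efficiently, you need to implement your own allocator directly on top of boost::pool. The reason is that object_pool uses "ordered free" semantics, which you do not want for your use case. For more details on this performance bug, see: https://svn.boost.org/trac/boost/ticket/3789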

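Second, multi_index_container actually needs to allocate several different things, depending on the indices you choose. Being able to allocate value_type is not enough; it also needs to allocate tree nodes and so on. This makes it a poor fit for a pool allocator, because pool allocators typically assume many instances of a single type (or at least a single size).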
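The rebinding behind this point can be observed with a small sketch. It uses std::set as a stand-in for a node-based container (an assumption; it is not multi_index_container itself), and recording_allocator, last_object_size and node_size_for_set_of_double are hypothetical names. The allocator is declared for double, but the container rebinds it to its internal node type, so the recorded object size is larger than sizeof(double).

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <set>

// Size in bytes of the object type most recently requested
// from any recording_allocator instantiation.
static std::size_t last_object_size = 0;

// Minimal C++11 allocator that records sizeof(T) on every allocation
// and forwards the actual work to std::allocator.
template<typename T>
struct recording_allocator
{
  typedef T value_type;

  recording_allocator() = default;
  template<typename U>
  recording_allocator(const recording_allocator<U>&) {}

  T* allocate(std::size_t n)
  {
    last_object_size = sizeof(T);
    return std::allocator<T>().allocate(n);
  }

  void deallocate(T* p, std::size_t n)
  {
    std::allocator<T>().deallocate(p, n);
  }
};

template<typename T, typename U>
bool operator==(const recording_allocator<T>&, const recording_allocator<U>&)
{ return true; }

template<typename T, typename U>
bool operator!=(const recording_allocator<T>&, const recording_allocator<U>&)
{ return false; }

// Inserts one element and reports the size of the type the container
// actually allocated: a tree node, not a bare double.
std::size_t node_size_for_set_of_double()
{
  std::set<double, std::less<double>, recording_allocator<double> > s;
  s.insert(1.0);
  return last_object_size;
}
```

A pool keyed on sizeof(value_type) would therefore never see a request of the size it was built for, which is why allocators such as boost::pool_allocator select their pool by the size of the rebound type.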

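If you want the best performance, you may need to "roll your own". Boost MIC and Boost Pool certainly do not fit together well. Another idea is to use a higher-performance general-purpose allocator such as tcmalloc: http://goog-perftools.sourceforge.net/doc/tcmalloc.html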

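You might also consider Boost.Intrusive, whose containers are well suited to pooled allocation. You can add hooks to your order type so that orders can be stored in ordered and unordered maps, and then allocate the orders from a boost::pool.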

Finally, since you appear to be storing financial data, you should know that using double to store prices is dangerous. For details, see: Why not use Double or Float to represent currency?
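A short sketch of the issue, assuming a platform with IEEE 754 doubles and prices quoted in cents (both are assumptions, and the type and function names are hypothetical): decimal fractions such as 0.1 have no exact binary representation, so sums of doubles can miss the expected value, while integer counts of the smallest price unit add exactly.

```cpp
#include <cstdint>

// Assumption: one unit of cents is 1/100 of a currency unit.
typedef std::int64_t cents;

// Adds 0.10 and 0.20 as doubles and compares against 0.30.
// With IEEE 754 doubles the sum is 0.30000000000000004, so this is false.
bool double_sum_is_exact()
{
  double a = 0.1;
  double b = 0.2;
  return a + b == 0.3;
}

// The same sum expressed in integer cents is exact.
bool cents_sum_is_exact()
{
  cents a = 10;
  cents b = 20;
  return a + b == 30;
}
```

In the order struct, this would mean replacing double price with an integer tick count; the ordered index on price works unchanged with std::less on the integer type.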

Answer 3

The first thing you need to do when you hit a performance bottleneck is profile!

It may turn out (and probably will) that allocation is not the worst offender.

How to use boost::object_pool as a boost::multi_index allocator?
