Question

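I have some data stored in a std::vector<std::unique_ptr<std::pair<Key, Data>>>, where Data is a large object and Key uniquely identifies a Data. We can assume there are no duplicate Keys, and the vector is sorted in ascending order by Key.

I have implemented insert as follows (modeled on the standard containers):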

bool compare(const std::unique_ptr<std::pair<Key,Data>>& elem,
             const std::unique_ptr<std::pair<Key,Data>>& input)
{
  return elem->first < input->first;
}

typedef std::vector<std::unique_ptr<std::pair<Key, Data>>> DataStore;

std::pair<DataStore::iterator, bool>
insert(DataStore& vec, const Key& k, const Data& d)
{
  using namespace std::placeholders; // for using bind

  // vvv-- This heap allocation seems unnecessary when element already exists.
  // seems mainly needed for lower_bound to work
  std::unique_ptr<std::pair<Key,Data>> valPtr(new std::pair<Key,Data>(k,d));

  auto location = std::lower_bound(std::begin(vec),
                                   std::end(vec), valPtr,
                                   std::bind(&compare, _1, _2));
  // exists, return element location 
  if(location != vec.end() && (*location)->first == k) {
    return std::make_pair(location, false);
  }

  // non-existing element, add it to the right location
  auto addedLocation = vec.emplace(location, std::move(valPtr));
  return std::make_pair(addedLocation, true);
}

Can anyone suggest a way to avoid the allocation at the commented location in insert above?

I would rather not write my own lower_bound / binary_search implementation.

Answer 1

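std::lower_bound does not require any relationship between the type T of the object being searched for and the element type, other than that the two can be compared as cmp(element, value) ¹. We can take advantage of that fact: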

std::pair<DataStore::iterator, bool>
insert(DataStore& vec, const Key& k, const Data& d)
{
  // Note: since you are using C++11, you can use lambdas rather than `std::bind`.
  // This simplifies your code and makes it easier for the compiler to optimize
  auto location = std::lower_bound(std::begin(vec), std::end(vec), k,
    [](const std::unique_ptr<std::pair<Key,Data>>& elem, const Key& key) {
      return elem->first < key;
    });

  // exists, return element location 
  if(location != vec.end() && (*location)->first == k) {
    return std::make_pair(location, false);
  }

  // non-existing element, add it to the right location
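  // Wrap the new pair in a unique_ptr first: passing the raw pointer to
  // emplace would leak it if the vector's reallocation throws.
  auto addedLocation = vec.emplace(location,
    std::unique_ptr<std::pair<Key,Data>>(new std::pair<Key,Data>(k,d)));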
  return std::make_pair(addedLocation, true);
}

¹ Many standard algorithms work this way. They are designed to be as generic as possible, so they place few requirements on the types passed in.
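To make the idea concrete, here is a minimal self-contained sketch of the key-only search. The aliases Key = int and Data = std::string, the Entry alias, and the example_usage function are assumptions made purely for illustration; the question does not say what Key and Data really are.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed stand-ins for the question's Key and Data types.
using Key = int;
using Data = std::string;
using Entry = std::pair<Key, Data>;
using DataStore = std::vector<std::unique_ptr<Entry>>;

std::pair<DataStore::iterator, bool>
insert(DataStore& vec, const Key& k, const Data& d)
{
  // Heterogeneous comparison: element on the left, bare key on the right.
  auto location = std::lower_bound(vec.begin(), vec.end(), k,
    [](const std::unique_ptr<Entry>& elem, const Key& key) {
      return elem->first < key;
    });

  // Key already present: return without allocating anything.
  if (location != vec.end() && (*location)->first == k) {
    return std::make_pair(location, false);
  }

  // Allocate only once we know the key is new.
  std::unique_ptr<Entry> entry(new Entry(k, d));
  return std::make_pair(vec.emplace(location, std::move(entry)), true);
}

// Hypothetical usage: insert out of order, then retry an existing key.
void example_usage()
{
  DataStore store;
  insert(store, 20, "twenty");
  insert(store, 10, "ten");
  auto result = insert(store, 20, "duplicate");

  assert(!result.second);                       // key 20 was already there
  assert((*result.first)->second == "twenty");  // stored data is unchanged
  assert(store.size() == 2);
  assert(store[0]->first == 10 && store[1]->first == 20);
}
```

The allocation now happens only on the path where the key is missing, which is what the question asked for.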

Answer 2

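Fortunately, std::lower_bound and similar functions do not actually require the functor to take two arguments of the same type.

lower_bound expects a functor that can be called with the dereferenced iterator as its first argument and the object being searched for as its second: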

auto location = std::lower_bound(
    v.begin(), v.end(),
    std::make_pair(k,d),
    [](const std::unique_ptr<std::pair<Key, Data>>& ptr,
       const std::pair<Key, Data>& val)
    { return ptr->first < val.first; } );
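One detail worth spelling out for the "similar functions": std::upper_bound calls the comparator with the arguments in the opposite order, cmp(value, element), and std::equal_range and std::binary_search use both orders. The sketch below is one way to cover both, assuming Key = int and Data = std::string as stand-ins; KeyLess, find_by_key and lookup_example are hypothetical names, not part of the question. It also searches by the bare key, so no temporary pair (and no copy of Data) is built for the lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed stand-ins for the question's Key and Data types.
using Key = int;
using Data = std::string;
using Entry = std::pair<Key, Data>;
using DataStore = std::vector<std::unique_ptr<Entry>>;

// Hypothetical helper: compares an element with a bare key in either order.
struct KeyLess {
  bool operator()(const std::unique_ptr<Entry>& elem, const Key& key) const {
    return elem->first < key;   // argument order used by std::lower_bound
  }
  bool operator()(const Key& key, const std::unique_ptr<Entry>& elem) const {
    return key < elem->first;   // argument order used by std::upper_bound
  }
};

// Hypothetical lookup: pointer to the stored Data for k, or nullptr if absent.
const Data* find_by_key(const DataStore& vec, const Key& k)
{
  auto range = std::equal_range(vec.begin(), vec.end(), k, KeyLess());
  return range.first != range.second ? &(*range.first)->second : nullptr;
}

// Hypothetical usage on a small vector that is already sorted by key.
void lookup_example()
{
  DataStore store;
  store.push_back(std::unique_ptr<Entry>(new Entry(10, "ten")));
  store.push_back(std::unique_ptr<Entry>(new Entry(20, "twenty")));

  const Data* hit = find_by_key(store, 20);
  assert(hit != nullptr && *hit == "twenty");
  assert(find_by_key(store, 15) == nullptr);
  assert(std::binary_search(store.begin(), store.end(), 10, KeyLess()));
}
```

A functor with two overloads is used here instead of a lambda because a single C++11 lambda can accept only one argument order.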

Avoiding heap allocation when inserting into a sorted vector<unique_ptr<pair<>>>