Searching for an object in part of an array

Time: 2014-10-26 09:39:43

Tags: c++ algorithm search binary-search

I have objects like this:

class search_object
{
public:
    unsigned int index; // 0 <= index <= 50000
    unsigned int search_field; // 1 <= search_field <= 5000000000, can be duplicates!
};

I have about 50000 of these objects. They are sorted by index.

My program receives search queries of this form: "Is there an object whose index lies between left_index and right_index (left_index <= index <= right_index) and whose search_field is equal to Number (search_field == Number)?"

There are about 50000 queries.

I have a solution, but it is too slow for the system it has to run on.

My algorithm is:

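  1. Sort the search objects by search_field.
  2. Find lower_index such that search_object[lower_index].search_field == Number (using the lower_bound() function, which is a binary search).
  3. Iterate over the object array from lower_index to the end of the array. If the current object's index lies between left_index and right_index, the answer is true. Otherwise, it is false.
  4. Repeat steps 2-3 for every search query (left_index, right_index and Number).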
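
The steps above can be sketched roughly as follows. This is only an illustration of the described approach, not the asker's actual code: the function names sort_by_field and query_linear are hypothetical, search_field is widened to 64 bits so that 5000000000 fits, and the scan is assumed to stop at the first object whose search_field differs from Number (the description does not say so explicitly).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct search_object {
    std::uint32_t index;         // 0 <= index <= 50000
    std::uint64_t search_field;  // 64-bit, because 5000000000 does not fit in 32 bits
};

// Step 1: sort once by search_field (hypothetical helper name).
void sort_by_field(std::vector<search_object>& objects) {
    std::sort(objects.begin(), objects.end(),
              [](const search_object& a, const search_object& b) {
                  return a.search_field < b.search_field;
              });
}

// Steps 2-3 for a single query; `objects` must already be sorted by search_field.
bool query_linear(const std::vector<search_object>& objects,
                  std::uint32_t left_index, std::uint32_t right_index,
                  std::uint64_t number) {
    // Step 2: binary search for the first object with search_field >= number.
    auto it = std::lower_bound(objects.begin(), objects.end(), number,
                               [](const search_object& o, std::uint64_t value) {
                                   return o.search_field < value;
                               });
    // Step 3: linear scan over the run of objects with search_field == number
    // (assumption: stop as soon as the field changes).
    for (; it != objects.end() && it->search_field == number; ++it) {
        if (left_index <= it->index && it->index <= right_index) return true;
    }
    return false;
}
```

The linear scan in step 3 is the likely bottleneck: if many objects share the same search_field, each query can still touch a large part of the array.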

3 Answers:

Answer 0: (score: 0)

I would use a standard container, and I would not use the object itself as the key. Code example:

std::map<decltype(search_object::index), search_object> container;

...

auto itr = container.lower_bound(left_index);

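for (; itr != container.end() && itr->first <= right_index; ++itr) // advance itr, otherwise the loop never ends
  if (itr->second.search_field == Number) return itr; // compare the field, not the whole object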

return container.end();

Answer 1: (score: 0)

You can use a map<unsigned int, list<search_object*>> to hold the list of objects for each search_field.
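
A minimal sketch of this idea, with some deliberate deviations that are assumptions rather than part of the answer: the key is a 64-bit integer (so that 5000000000 fits), each entry stores a sorted std::vector of indices instead of a list of pointers so that it can be binary-searched, and the names field_index, build_field_index and query_map are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

struct search_object {
    std::uint32_t index;
    std::uint64_t search_field;
};

// Hypothetical alias: maps each search_field value to the indices that carry it.
using field_index = std::map<std::uint64_t, std::vector<std::uint32_t>>;

// Build the map once, before answering any query.
field_index build_field_index(const std::vector<search_object>& objects) {
    field_index by_field;
    for (const search_object& o : objects) {
        by_field[o.search_field].push_back(o.index);
    }
    // Sort each index list so that it can be binary-searched per query.
    for (auto& entry : by_field) {
        std::sort(entry.second.begin(), entry.second.end());
    }
    return by_field;
}

// True if some object has search_field == number and
// left_index <= index <= right_index.
bool query_map(const field_index& by_field,
               std::uint32_t left_index, std::uint32_t right_index,
               std::uint64_t number) {
    auto found = by_field.find(number);
    if (found == by_field.end()) return false;
    const std::vector<std::uint32_t>& indices = found->second;
    // First stored index that is >= left_index.
    auto it = std::lower_bound(indices.begin(), indices.end(), left_index);
    return it != indices.end() && *it <= right_index;
}
```

With this layout each query costs one map lookup plus one binary search, regardless of how many objects share the same search_field.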

Answer 2: (score: 0)

This is easy to do if you choose the right sort order before searching.

#include <iostream>     // std::cout
#include <algorithm>    // std::lower_bound, std::upper_bound, std::sort
#include <cstdint>      // uint64_t
#include <utility>      // std::pair
#include <vector>       // std::vector

class search_object
{
public:
    uint64_t index; // 0 <= index <= 50000
    uint64_t search_field; // 1 <= search_field <= 5000000000, can be duplicates!
};

search_object data[] = { { 13, 54345632 }, { 42, 4645347 }, { 63, 4645347 }, { 117, 4674534536 } };

using table = std::vector<search_object>;
using itr = table::const_iterator;
using range = std::pair<itr, itr>;

template<typename Pred>
range FindRange(const table& vec, Pred pred, search_object lowValue, search_object highValue) {
  // precondition: vec is sorted by pred
  //assert(pred(lowValue, highValue)); // paranoid check

  itr low=std::lower_bound (vec.begin(), vec.end(), lowValue, pred); //
  itr up= std::upper_bound (low,         vec.end(), highValue, pred); // no reason to search before low!

  return range(low, up);
}

int main () {
  std::vector<search_object> FldTable(std::begin(data),std::end(data));

  // sort by primary key field and secondary index.
  auto CmpFld = [] (const search_object& obj1, const search_object& obj2) {
    return obj1.search_field < obj2.search_field || // primary sort
           ((obj1.search_field == obj2.search_field) && // secondary
        (obj1.index < obj2.index)
       );
  };

  // sort by field, then by index
  std::sort (FldTable.begin(), FldTable.end(), CmpFld); // duplicates possible.

  // some "random" search values
  uint64_t lowSearchIndex = 10, highSearchIndex = 100, searchField = 4645347; // 64-bit, so large field values are not truncated

  search_object low { lowSearchIndex,  searchField };
  search_object up  { highSearchIndex, searchField };
  range search = FindRange(FldTable, CmpFld, low, up);

  for (itr record = search.first; record < search.second; ++record)
     std::cout << "index = " << record->index << " field = " << record-> search_field << "\n";

  return 0;
}

Output

index = 42 field = 4645347
index = 63 field = 4645347
  
    

std::upper_bound returns an iterator pointing to the first element in the range [first, last) that compares greater than val.
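
Since the question only asks whether a matching object exists, the same (search_field, index) sort order arguably allows a single std::lower_bound per query. The following is a hedged sketch under that assumption; by_field_then_index and exists_in_range are hypothetical names, not part of the answer's code.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

struct search_object {
    std::uint64_t index;
    std::uint64_t search_field;
};

// Orders by search_field first and by index second (same order as CmpFld).
bool by_field_then_index(const search_object& a, const search_object& b) {
    return std::tie(a.search_field, a.index) < std::tie(b.search_field, b.index);
}

// `sorted` must already be sorted with by_field_then_index.
bool exists_in_range(const std::vector<search_object>& sorted,
                     std::uint64_t left_index, std::uint64_t right_index,
                     std::uint64_t number) {
    const search_object key{left_index, number};
    // First element that is not less than (number, left_index).
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key, by_field_then_index);
    // If that element still has the requested field, its index is >= left_index,
    // so only the upper bound remains to be checked.
    return it != sorted.end() && it->search_field == number && it->index <= right_index;
}
```

Sorting costs O(n log n) once, and each query is then a single O(log n) binary search with no linear scan.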