Question

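I am experimenting with a well-known algorithm that is meant to reduce the number of comparisons when searching for an element in an unsorted array. The algorithm places a sentinel at the back of the array, which makes it possible to write a loop that performs one comparison per iteration instead of two. Note that the overall Big O complexity does not change; it is still O(n). In terms of the number of comparisons, though, the standard find algorithm performs about 2n and the sentinel algorithm about n.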

The standard find algorithm from the C++ library works like this:

template<class InputIt, class T>
InputIt find(InputIt first, InputIt last, const T& value)
{
    for (; first != last; ++first) {
        if (*first == value) {
            return first;
        }
    }
    return last;
}

Here we can see two comparisons and one increment per iteration.

In the algorithm with the sentinel, the loop looks like this:

while (a[i] != key)
      ++i;

There is only one comparison and one increment.
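To make the "two comparisons versus one" claim concrete, here is a minimal sketch that counts the comparisons performed by each loop. The helper names `count_plain` and `count_sentinel` are hypothetical and exist only for illustration; the sentinel variant works on a copy with one extra slot appended, which is an assumption made to keep the example free of side effects.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the bounds check and the element check
// that a plain linear search performs on each iteration.
std::size_t count_plain(const std::vector<char>& a, char key) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0;; ++i) {
        ++comparisons;              // bounds check: i == a.size()?
        if (i == a.size()) break;
        ++comparisons;              // element check: a[i] == key?
        if (a[i] == key) break;
    }
    return comparisons;
}

// Hypothetical helper: counts the comparisons of the sentinel loop.
// Takes the vector by value so the caller's data is left untouched.
std::size_t count_sentinel(std::vector<char> a, char key) {
    std::size_t comparisons = 0;
    a.push_back(key);               // sentinel guarantees the loop terminates
    std::size_t i = 0;
    for (;;) {
        ++comparisons;              // the only check per iteration
        if (a[i] == key) break;
        ++i;
    }
    ++comparisons;                  // one final check tells a real hit from the sentinel
    return comparisons;
}
```

For an 8-element array with the key at index 6, `count_plain` returns 14 and `count_sentinel` returns 8. Counting comparisons this way says nothing about wall-clock time, which also depends on memory bandwidth and on what the compiler does with each loop.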

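I ran some experiments and measured the time, but the results differ on every computer. Unfortunately, I don't have access to any serious machine; I only have my laptop with VirtualBox running Ubuntu, where I compile and run the code. I am also limited by the amount of memory. I tried online compilers such as Wandbox and Ideone, but their time and memory limits do not allow reliable experiments. Every time I run the code, change the number of elements in the vector, or change the number of test runs, I see different results. Sometimes the times are comparable, sometimes std::find is significantly faster, and sometimes the sentinel algorithm is significantly faster.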

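I am surprised, because logic says the sentinel version should be faster, and faster every time. Do you have an explanation for this? Do you have any experience with this kind of algorithm? Is it still worth even trying to use it in production code when performance is crucial and the array cannot be sorted (and no other mechanism for solving this problem, such as a hash map or indexing, can be used)?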

Here is my test code. It isn't pretty, in fact it is ugly, but beauty was not my goal here. Is there perhaps something wrong with my code?

#include <iostream>
#include <algorithm>
#include <chrono>
#include <vector>
using namespace std::chrono;
using namespace std;

const unsigned long long N = 300000000U;

static void find_with_sentinel()
{
   vector<char> a(N);
   char key = 1;
   a[N - 2] = key; // make sure the searched element is in the array at the last but one index

   unsigned long long high = N - 1;
   auto tmp = a[high];

   // put a sentinel at the end of the array
   a[high] = key;

   unsigned long long i = 0;
   while (a[i] != key)
      ++i;

   // restore original value
   a[high] = tmp;

   if (i == high && key != tmp)
      cout << "find with sentinel, not found" << endl;
   else
      cout << "find with sentinel, found" << endl;
}

static void find_with_std_find()
{
   vector<char> a(N);
   int key = 1;
   a[N - 2] = key; // make sure the searched element is in the array at the last but one index

   auto pos = find(begin(a), end(a), key);
   if (pos != end(a))
      cout << "find with std::find, found" << endl;
   else
      cout << "find with std::find, not found" << endl;
}

int main()
{
   const int times = 10;
   high_resolution_clock::time_point t1 = high_resolution_clock::now();
   for (auto i = 0; i < times; ++i)
      find_with_std_find();
   high_resolution_clock::time_point t2 = high_resolution_clock::now();
   auto duration = duration_cast<milliseconds>(t2 - t1).count();
   cout << "std::find time = " << duration << endl;

   t1 = high_resolution_clock::now();
   for (auto i = 0; i < times; ++i)
      find_with_sentinel();
   t2 = high_resolution_clock::now();
   duration = duration_cast<milliseconds>(t2 - t1).count();
   cout << "sentinel time = " << duration << endl;
}

Answer 1

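Move the memory allocation (the vector construction) outside the measured functions (e.g. pass the vector as an argument).
Increase times to a few thousand.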
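A minimal sketch of what that could look like: the vector is allocated once by the caller and passed by reference, so only the searches fall inside the timed region. The names `time_search` and `std_find_index` are hypothetical, and `steady_clock` is used here instead of the question's `high_resolution_clock` as an assumption about what suits interval timing. Adding each result to a checksum makes it harder for the compiler to discard the calls, though it is not a guarantee.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: returns the index of the first match, or a.size().
std::size_t std_find_index(std::vector<char>& a, char key) {
    return static_cast<std::size_t>(std::find(a.begin(), a.end(), key) - a.begin());
}

// Hypothetical harness: runs `search` `times` times on an already
// allocated vector and returns the total elapsed microseconds.
template <class Search>
long long time_search(Search search, std::vector<char>& data, char key,
                      int times, std::size_t& checksum) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < times; ++i)
        checksum += search(data, key);   // keep the result observable
    const auto stop = clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A caller would build the vector once, then call something like `time_search(std_find_index, data, key, 5000, checksum)` for each search function being compared and print the checksum afterwards.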

Answer 2

You are doing a lot of time-consuming work inside your functions. That work hides the differences in timing. Consider your find_with_sentinel function:

static void find_with_sentinel()
{
   // ***************************
   vector<char> a(N);
   char key = 1;
   a[N - 2] = key; // make sure the searched element is in the array at the last but one index
   // ***************************

   unsigned long long high = N - 1;
   auto tmp = a[high];

   // put a sentinel at the end of the array
   a[high] = key;

   unsigned long long i = 0;
   while (a[i] != key)
      ++i;

   // restore original value
   a[high] = tmp;

   // ***************************************  
   if (i == high && key != tmp)
      cout << "find with sentinel, not found" << endl;
   else
      cout << "find with sentinel, found" << endl;
   // **************************************
}

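The three lines at the top and the four lines at the bottom are identical in both functions, and they are fairly expensive to run. The top contains a memory allocation, and the bottom contains an expensive output operation. These mask the time it takes to do the actual work of the function.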

You need to move the allocation and the output out of the function. Change the function signature to:

static int find_with_sentinel(vector<char> a, char key);

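In other words, make it the same as std::find. If you do that, you don't have to wrap std::find, and you get a more realistic view of how your function performs in a typical situation.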
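As a rough sketch of a search function with no allocation and no output inside it, something along these lines could work. Two details are assumptions rather than part of the signature above: the vector is taken by non-const reference so the call does not copy the array, and the function returns an index, with `a.size()` meaning "not found".

```cpp
#include <cstddef>
#include <vector>

// Sketch only: returns the index of the first element equal to `key`,
// or a.size() if there is none. The last element is overwritten with
// the sentinel during the search and restored before returning.
std::size_t find_with_sentinel(std::vector<char>& a, char key) {
    if (a.empty()) return 0;             // nothing to search; 0 == a.size()

    const std::size_t last = a.size() - 1;
    const char saved = a[last];
    a[last] = key;                       // put the sentinel in place

    std::size_t i = 0;
    while (a[i] != key)                  // single comparison per iteration
        ++i;

    a[last] = saved;                     // restore the original value

    // A hit before the last slot is genuine; a hit at the last slot
    // counts only if the original value there was the key.
    if (i < last || saved == key) return i;
    return a.size();
}
```

Because the function temporarily writes to the array, it needs mutable access to the data for the duration of the call.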

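It is quite possible that the sentinel find function will be faster. However, it comes with some drawbacks. The first is that you cannot use it with immutable lists. The second is that it is not safe to use in a multithreaded program, because one thread could overwrite the sentinel that another thread is using. It also might not be "faster enough" to justify replacing std::find.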
