Question

Given a random array of size N, find the number that occurs more than N/3 times. For example:

{1,2,14,12,12,15,12,12,8} the result is 12

Does anyone have a more efficient algorithm? This is how I did it:

int getNum(int *arr, int left, int right, const int size)
{
    srand(time(0));
    int index = rand()%(right - left + 1) + left;
    std::swap(arr[left], arr[index]);
    int flag = arr[left];
    int small = left;
    int big = right;
    int equal = left;
    while(equal <= big)
    {
        if(arr[equal] == flag)
        {
            equal++;
        }
        else if(arr[equal] < flag)
        {
            std::swap(arr[equal++], arr[small++]);
        }
        else
        {
            while(big > equal && arr[big] > flag)
            {
                big--;
            }
            std::swap(arr[big], arr[equal]);
            big--;
        }
    }
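    // arr[small..equal-1] now holds every copy of flag within [left, right].
    if(equal - small > size / 3)
    {
        return arr[small];
    }
    // A value occurring more than size/3 times can only sit in a part
    // that itself has more than size/3 elements.
    if(small - left > size / 3)
    {
        int found = getNum(arr, left, small - 1, size);
        if(found != -1)
        {
            return found;
        }
    }
    if(right - equal + 1 > size / 3)
    {
        return getNum(arr, equal, right, size);
    }
    return -1;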
}

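First, I define three markers: small, equal and big. I pick a number as the pivot (flag) and then find the range it occupies. When equal - small > size / 3, that is the number we are looking for; otherwise I find the side whose size exceeds size / 3 and recurse.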

Answer 1

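Actually, there is an algorithm proposed by Karp, Papadimitriou and Shenker that finds, in a single pass, the items that make up more than a 1/k fraction of the data. It can of course be applied with k = 3.

However, the algorithm can give false positives (items reported as frequent that are not). These are easily eliminated with a second pass over the data that counts the few remaining candidates (at most k - 1 of them).

The algorithm is as follows: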

pf = {}
for each element e:
  if pf.containsKey(e): 
     pf.put(e, pf.get(e)+1) //increase the value by 1
  else:
     pf.put(e,1)
     if pf.size() == k:
         for each key in pf:
              pf.put(key, pf.get(key)-1) //decrease all elements by 1
              if pf.get(key) == 0: //remove elements with value 0
                 pf.remove(key)
output pf

More information about the algorithm above, including a proof, can be found on this page, slides 8-12.

Even with the second pass, the algorithm runs in O(n) time with O(k) extra space (k == 3 in your case).
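For reference, here is a minimal C++ sketch of the two passes described above. The function name `frequentItems` and its signature are my own choices for illustration, not something from the slides. Instead of inserting the new element and then decrementing everything, it keeps at most k - 1 counters and cancels one copy of each when a k-th distinct value shows up, which should have the same effect.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns every value that occurs more than n/k times.
std::vector<int> frequentItems(const std::vector<int>& data, std::size_t k) {
    assert(k >= 2);

    // Pass 1: track at most k-1 candidate values with counters.
    std::unordered_map<int, std::size_t> counters;
    for (int e : data) {
        auto it = counters.find(e);
        if (it != counters.end()) {
            ++it->second;                 // already tracked
        } else if (counters.size() < k - 1) {
            counters.emplace(e, 1);       // free slot: start tracking e
        } else {
            // e plus the k-1 tracked values are k distinct elements:
            // drop one copy of each and forget counters that reach zero.
            for (auto c = counters.begin(); c != counters.end();) {
                if (--c->second == 0) c = counters.erase(c);
                else ++c;
            }
        }
    }

    // Pass 2: count the survivors exactly to weed out false positives.
    std::vector<int> result;
    for (const auto& kv : counters) {
        std::size_t actual = 0;
        for (int e : data) {
            if (e == kv.first) ++actual;
        }
        if (actual * k > data.size()) result.push_back(kv.first);
    }
    return result;
}
```

Calling `frequentItems(v, 3)` on the array from the question should return a vector containing only 12. The order of the results is unspecified when two values qualify.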

Answer 2

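Another (probabilistic) algorithm: pick 50 random values from the array.

Take the value that appears most often among those 50 and check whether it meets the criterion in the original array (picking it is O(1) because 50 is a constant; the check is one pass over the array). This will work the first time with about a 99% chance. If it fails, take the second most frequent value from the small (50-element) array and try that, and so on. The overall complexity is O(n), but the approach needs to be modified if the original array may contain no value that meets the criterion.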
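A rough C++ sketch of the sampling idea, under a few assumptions of mine: the name `sampleFrequent`, the `std::pair<bool, int>` return type and the `seed` parameter are made up for illustration. It tries every distinct sampled value in order of sample frequency and reports failure if none passes the exact check, which is one possible way to handle arrays with no qualifying value. A `false` result only means no sampled value qualified, so it can (very rarely) miss a real answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: {true, value} if a sampled value occurs more than
// n/3 times in data, {false, 0} if no sampled value passes the exact check.
std::pair<bool, int> sampleFrequent(const std::vector<int>& data,
                                    std::size_t samples = 50,
                                    unsigned seed = 12345u) {
    assert(samples > 0);
    if (data.empty()) return {false, 0};

    // Draw `samples` random positions and tally the values found there.
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    std::map<int, std::size_t> tally;
    for (std::size_t i = 0; i < samples; ++i) {
        ++tally[data[pick(rng)]];
    }

    // Order the sampled values by how often they were drawn, most frequent first.
    std::vector<std::pair<std::size_t, int>> order;
    for (const auto& kv : tally) order.emplace_back(kv.second, kv.first);
    std::sort(order.rbegin(), order.rend());

    // Verify each candidate against the full array (one O(n) pass each,
    // at most `samples` passes in total).
    for (const auto& c : order) {
        const auto actual = static_cast<std::size_t>(
            std::count(data.begin(), data.end(), c.second));
        if (actual * 3 > data.size()) return {true, c.second};
    }
    return {false, 0};
}
```

Since the number of candidates is bounded by the sample size, the worst case is still a constant number of passes over the array.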

Answer 3

My solution is to sort the elements; if the element at index i + N/3 - 1 equals the element at index i, then that element occurs at least N/3 times.

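#include <stdio.h>
#include <stdlib.h>   /* qsort */

/* Compare two ints without the overflow risk of subtracting them. */
int compar(const void *a, const void *b) {
    int x = *(const int*)a;
    int y = *(const int*)b;
    return (x > y) - (x < y);
}

int main() {
    int N = 9;
    int N3 = N / 3;
    int tab[] = {1,2,14,12,12,15,12,12,8};

    qsort(tab, N, sizeof(int), compar);

    int i = 0;
    while (i + N3 - 1 < N) {
        if (tab[i] == tab[i+N3-1]) {
            /* tab[i] fills a window of N3 slots: print it once, then skip its run. */
            int v = tab[i];
            printf("%d\n", v);
            while (i < N && tab[i] == v) {
                i++;
            }
        } else {
            i++;
        }
    }

    return 0;
}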

The complexity is O(n log n) (because of the sort). If the array is already sorted, it is linear.
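Note that the check above is "at least N/3", while the question asks for "more than N/3". A possible C++ variant for the strict version widens the window to N/3 + 1 elements. This is only a sketch; the name `moreThanThird` is my own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: values occurring strictly more than n/3 times,
// in ascending order. Takes the vector by value so the caller's data is untouched.
std::vector<int> moreThanThird(std::vector<int> data) {
    std::sort(data.begin(), data.end());
    const std::size_t n = data.size();
    const std::size_t window = n / 3 + 1;   // smallest count that exceeds n/3

    std::vector<int> result;
    std::size_t i = 0;
    while (i + window <= n) {
        if (data[i] == data[i + window - 1]) {
            // In sorted data, equal ends mean the whole window is one value.
            const int v = data[i];
            result.push_back(v);
            while (i < n && data[i] == v) ++i;   // skip the rest of this run
        } else {
            ++i;
        }
    }
    assert(result.size() <= 2);   // at most two values can exceed n/3
    return result;
}
```

For the array in the question this should give a single-element result containing 12.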

Answer 4

My first attempt at solving this problem used a HashMap. The code is as follows:

public int O_N_Memory_Solution(final List<Integer> a){
    int repetationCount = a.size()/3;
    HashMap<Integer,Integer> map = new HashMap<Integer,Integer>();
    for(int i = 0 ; i < a.size() ; i++){
        if(map.containsKey(a.get(i)))   map.put(a.get(i),map.get(a.get(i))+1);
        else    map.put(a.get(i),1);
        if(map.get(a.get(i))>repetationCount) return a.get(i);
    }
    return -1;
}

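This code is short, but it consumes more memory and time.

I didn't know there was an algorithm for this problem; here is my implementation of the Karp-Papadimitriou-Shenker algorithm.

The main idea of the algorithm is to notice that removing K distinct elements from the array does not change the answer.

Here K equals 3, and we are trying to find any element that occurs more than n/3 times in the array.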

// Karp-Papadimitriou-Shenker Algorithm
public int O_1_Memory_Solution(final List<Integer> a){
    if(a.size() == 0) return -1;

    int firstInt = 0, secondInt = 0;
    int firstCount = 0;
    int secondCount = 0;
    int current;

    for(int i = 0; i < a.size(); i++){

        current = a.get(i);

        // You should check 1st before setting so that, if one of the two integers is empty, 
        // you increment the non empty integer if the current matches it, not adding the current to
        // the empty one.

        if(current == firstInt && firstCount!=0) {
            firstCount++;
        } else if(current == secondInt && secondCount!=0) {
            secondCount++;
        } else if(firstCount == 0) {
            firstInt = current;
            firstCount = 1;
        } else if(secondCount == 0) {
            secondInt = current;
            secondCount = 1;
        } else {
            firstCount--;
            secondCount--;
        }

    }

    int repetationCount = a.size()/3;
    int[] candidates = {firstInt,secondInt};
    int ac;
    /* Check actual counts of potential candidates */
    for (int i = 0; i < candidates.length; i++) {
        // Calculate actual count of elements 
        ac = 0;  // actual count
        for (int j = 0; j < a.size(); j++)
            if (a.get(j) == candidates[i])
                ac++;

        // If the actual count is more than n/3, return the candidate
        if (ac > repetationCount) return candidates[i];
    }

    return -1;
}
