Question

Is there an efficient way to count how many times each pair occurs in a vector? For example, if I have a vector like this:

 vector<pair<int, int> > duplicates={{1,2},{3,2},{2,1},{5,6},{5,6},{1,2},{2,1},{5,6}};

The output should be:

 {1,2}:2
 {3,2}:1
 {2,1}:2
 {5,6}:3

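To be clear, I'm just curious how to solve this more efficiently. I tried comparing every pair in the vector against every other pair, and that doesn't seem like a smart approach.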

Answer 1

A simple way is to use a map or an unordered map to count them:

#include <iostream>
#include <map>
#include <utility>
#include <vector>

int main()
{
    std::vector<std::pair<int, int> > duplicates={{1,2},{3,2},{2,1},{5,6},{5,6},{1,2},{2,1},{5,6}};
    std::map<std::pair<int, int>, int> checker;
    for (const auto &elem: duplicates)
    {
        ++checker[elem];
    }

    for (const auto &elem: checker) std::cout << "{" << elem.first.first <<
                                                 "," << elem.first.second <<
                                                 "}: " << elem.second << std::endl;

    return 0;
}

Note that map insertion/lookup is O(log(n)), and the loop makes the whole thing approximately O(n*log(n)).

Edit:

Following the OP's additional clarification, here is a better (faster) implementation using unordered_map:

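#include <cstddef>
#include <iostream>
#include <unordered_map>
#include <utility>
#include <vector>

// std::hash has no specialization for std::pair, and adding one inside
// namespace std for a pair of built-in types is undefined behavior,
// so a custom hasher is passed to unordered_map instead.
struct PairHash
{
    std::size_t operator()(std::pair<int, int> const &p) const
    {
        // Pack first into the upper half of size_t and second into the lower half.
        // Fine for 64-bit size_t and 32-bit int. Otherwise, some collisions may happen.
        // Casting through unsigned int keeps negative values from sign-extending.
        std::size_t result = (static_cast<std::size_t>(static_cast<unsigned int>(p.first)) << (sizeof(std::size_t) << 2))
                             + static_cast<std::size_t>(static_cast<unsigned int>(p.second));
        return result;
    }
};

int main()
{
    std::vector<std::pair<int, int> > duplicates={{1,2},{3,2},{2,1},{5,6},{5,6},{1,2},{2,1},{5,6}};
    std::unordered_map<std::pair<int, int>, int, PairHash> checker;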
    for (const auto &elem: duplicates)
    {
        ++checker[elem]; // value initialized with 0
    }

    for (const auto &elem: checker) std::cout << "{" << elem.first.first <<
                                                 "," << elem.first.second <<
                                                 "}: " << elem.second << std::endl;

    return 0;
}

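Insertion into an unordered_map uses hashing, so it is constant time on average (in the worst case, with many collisions, it is linear). The overall average complexity is O(N).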
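
If the width assumption mentioned in the hash's comment doesn't hold on your platform, one alternative is to mix the two std::hash<int> values instead of packing them into one size_t. Here is a minimal sketch, assuming a mixing step in the style of boost::hash_combine. The names PairHashCombine, PairCounts and count_pairs are made up for illustration and are not part of the code above:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical hasher: mixes the two element hashes instead of packing bits,
// so it does not depend on the relative widths of int and size_t.
struct PairHashCombine
{
    std::size_t operator()(const std::pair<int, int> &p) const
    {
        std::size_t seed = std::hash<int>{}(p.first);
        // Mixing step in the style of boost::hash_combine.
        seed ^= std::hash<int>{}(p.second) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

using PairCounts = std::unordered_map<std::pair<int, int>, int, PairHashCombine>;

// Counts how many times each pair occurs in v.
PairCounts count_pairs(const std::vector<std::pair<int, int>> &v)
{
    PairCounts counts;
    counts.reserve(v.size()); // at most v.size() distinct keys, so no rehash while counting
    for (const auto &p : v)
        ++counts[p]; // a missing key is value-initialized to 0 first
    return counts;
}
```

Collisions are still possible with this hasher (they are with any hash), so the average O(N) and linear worst case stay the same. Iterating over the result gives the pairs in no particular order.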

Answer 2

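I have a simple solution:

Sort the vector of pairs
Then do a single loop: whenever consecutive pairs match, increment a counter

Complexity of the naive search: n * n
Complexity of this approach: n log(n)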
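
The two steps above could be sketched roughly like this. It is only a minimal illustration, not the answerer's own code, and the names Pair and count_by_sorting are assumptions made for the example:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

// Returns each distinct pair together with its number of occurrences.
// The vector is taken by value, so the caller's data is not reordered.
std::vector<std::pair<Pair, std::size_t>> count_by_sorting(std::vector<Pair> v)
{
    // Step 1: sort, O(n log n). std::pair compares lexicographically.
    std::sort(v.begin(), v.end());

    // Step 2: one linear pass; equal pairs are now adjacent.
    std::vector<std::pair<Pair, std::size_t>> counts;
    for (const Pair &p : v)
    {
        if (!counts.empty() && counts.back().first == p)
            ++counts.back().second;                 // same as previous pair: extend the run
        else
            counts.emplace_back(p, std::size_t{1}); // a new run starts here
    }
    return counts;
}
```

With this approach the results come out in sorted order ({1,2}, {2,1}, {3,2}, {5,6} for the vector in the question) rather than in order of first appearance, and no hash function is needed.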
