Question

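Problem domain

I have a (potentially) long list of pairs of data that I need to merge (and perform some logic on) so that there are no duplicates. The pairs were of type int, but because the data volume has grown I converted them to pairs of size_t, so my data type is now declared as pair<size_t, size_t>.

The previous code checked uniqueness through a hash_set, probing it to see whether a given pair had already been seen and processed. As a key it conveniently used an INT64, built by bit shifting and packing: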

INT64 key = ((INT64)pairsListEntry->first) << 32 | pairsListEntry->second;

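This worked well because two ints fit exactly into an INT64 and yield a unique key. It no longer works, since two 64-bit size_t values do not fit into a single INT64.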
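To make the failure concrete, here is a minimal sketch of the packing trick and of what happens when the same expression is applied to 64-bit inputs. The names `pack32` and `pack64_naive` are hypothetical and only illustrate the idea; fixed-width types from `<cstdint>` stand in for int, size_t, and INT64.

```cpp
#include <cstdint>

// Packs two 32-bit values into one 64-bit key.
// Each half of the key holds exactly one input, so distinct pairs give distinct keys.
inline std::uint64_t pack32(std::uint32_t first, std::uint32_t second)
{
    return (static_cast<std::uint64_t>(first) << 32) | second;
}

// The same expression applied to 64-bit inputs (hypothetical, for illustration).
// The upper 32 bits of `first` are shifted out, and the upper 32 bits of
// `second` overlap the bits that came from `first`, so distinct pairs can collide.
inline std::uint64_t pack64_naive(std::uint64_t first, std::uint64_t second)
{
    return (first << 32) | second;
}
```

For example, `pack64_naive(1, 0)` and `pack64_naive(0, 1ULL << 32)` yield the same key even though the pairs differ, which is why a real hash plus an equality or ordering check on the full pair is needed.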

Immediate problem

To adjust to the new size, I tried refactoring and declared the hash_set as:

std::hash_set<pair<size_t, size_t>> m_seenPairs;

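However, this fails when compiling the code that creates an instance of the class, with the following error message:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\xhash(71) : error C2440: 'type cast' : cannot convert from 'const std::pair<_Ty1,_Ty2>' to 'size_t'

This occurs deep in the implementation of the STL, in the following function (on the return line):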

template<class _Kty> inline
size_t hash_value(const _Kty& _Keyval)
{   // hash _Keyval to size_t value one-to-one
return ((size_t)_Keyval ^ _HASH_SEED);
}

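The reason is quite clear: pair<T1, T2> does not know how to convert itself to size_t in order to compute the hash code.

At this point I am stuck on how to make it work. My Google-fu has not turned up much. I looked at several SO posts about std::map and pair, but there it seems to "just work".

The environment is VS2008, x64 platform, unmanaged target.

Things I have tried

I tried supplying my own comparer, since I saw a post that looked at least barely remotely similar, as follows: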

struct pairs_equal_compare
{
    bool operator()(const pair<SampleIdIndex_t, SampleIdIndex_t> & p1, const pair<SampleIdIndex_t, SampleIdIndex_t> & p2) const
    {
        return (p1.first == p2.first) && (p1.second == p2.second);
    }
};

// Holds a set of pairs that are known to exist for deduplication purposes.
stdext::hash_set<pair<SampleIdIndex_t, SampleIdIndex_t>, 
    stdext::hash_compare<pair<SampleIdIndex_t, SampleIdIndex_t>, pairs_equal_compare>> m_seenPairs;

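This (once I got the declaration and the struct declared properly) results in exactly the same error. I now realize that it does nothing to bypass the internal call to hash_value that computes the hash code.

I also tried using pairs_equal_compare in place of hash_compare, but that produced even more compile errors and looked like the wrong direction...

It seems there must be a reasonable way to make hash_set handle pair (or any non-integral type of data), but I cannot figure out how to accomplish it.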

Answer 1

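Out of the box, stdext::hash_set<> only works with types that are implicitly convertible to size_t. For std::pair<>, you need to supply an argument to stdext::hash_compare<> (for stdext::hash_set<>'s Traits parameter) that behaves like std::pair<> but is also convertible to size_t, since std::pair<> itself is not.

The following works for me with VS2013, and I see no reason why it would not work with VS2008 as well: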

#include <cstddef>
#include <utility>
#include <hash_set>

struct pair_hasher
{
    typedef std::pair<std::size_t, std::size_t> value_type;

    value_type value;

    pair_hasher(value_type const& v) : value(v) { }

    operator std::size_t() const
    {
        return (5381 * 33 ^ value.first) * 33 ^ value.second;
    }
};

bool operator <(pair_hasher const& a, pair_hasher const& b)
{
    return a.value < b.value;
}

Then you need to declare your stdext::hash_set<> instance like so:

stdext::hash_set<
    std::pair<std::size_t, std::size_t>,
    stdext::hash_compare<pair_hasher>
> s;

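For types other than integral types in the std::pair<>, update pair_hasher::operator std::size_t as necessary (operator < should be fine as-is, as long as the types in the std::pair<> are themselves already comparable).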
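As a rough illustration of that last point, the wrapper could be templated so that each element is first reduced to a size_t before being mixed. This is only a sketch under assumptions: `generic_pair_hasher` is a hypothetical name, and `std::hash` requires C++11, so it would not compile as-is on VS2008. It has not been tried with stdext::hash_set<>.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical generalization of pair_hasher: wraps any std::pair<T1, T2>
// whose element types are supported by std::hash (a C++11 assumption).
template <class T1, class T2>
struct generic_pair_hasher
{
    typedef std::pair<T1, T2> value_type;

    value_type value;

    generic_pair_hasher(value_type const& v) : value(v) { }

    operator std::size_t() const
    {
        // Reduce each element to a size_t first, then mix them with the
        // same multiply-and-xor scheme used by pair_hasher above.
        std::size_t const h1 = std::hash<T1>()(value.first);
        std::size_t const h2 = std::hash<T2>()(value.second);
        return (5381 * 33 ^ h1) * 33 ^ h2;
    }
};

// Ordering is delegated to std::pair's own operator<, so T1 and T2
// only need to be comparable themselves.
template <class T1, class T2>
bool operator <(generic_pair_hasher<T1, T2> const& a,
                generic_pair_hasher<T1, T2> const& b)
{
    return a.value < b.value;
}
```

With this, something like `generic_pair_hasher<std::string, int>` would play the role that pair_hasher plays for pairs of size_t.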

Answer 2

You can also use a suitable hash_compare-like object as the Traits, i.e. one that defines two operator() overloads:

size_t operator()(const Key &key) const; // This one returns the hash of key
bool operator()(const Key &first, 
                const Key &second) const; // This one returns true if first is less than second

and two integer constants, which you can take from the default implementation:

const size_t bucket_size = 4;
const size_t min_buckets = 8;

See the documentation of hash_compare.

The code would look like

struct pair_comparator{
    typedef std::pair<std::size_t, std::size_t> Key;
    size_t operator()(const Key &key) const { return /* your hash code here */; }
    bool operator()(const Key &first, 
                const Key &second) const { return first < second; }
    // static: C++03 does not allow in-class initializers for non-static members
    static const size_t bucket_size = 4;
    static const size_t min_buckets = 8;
};

stdext::hash_set<
    std::pair<std::size_t, std::size_t>,
    pair_comparator
> s;
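The `/* your hash code here */` placeholder is left open on purpose. One possible way to fill it in is sketched below, assuming a mixing step in the style of Boost's hash_combine; that choice, the constant, and the name `pair_traits_sketch` are illustrative assumptions, not something the hash_compare documentation prescribes. The struct avoids any stdext dependency so it can be exercised on its own.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical traits type with the same shape as pair_comparator above.
struct pair_traits_sketch
{
    typedef std::pair<std::size_t, std::size_t> Key;

    static const std::size_t bucket_size = 4;
    static const std::size_t min_buckets = 8;

    // Hash: start from the first element and mix in the second
    // (hash_combine-style mixing; an assumption, any decent mix would do).
    std::size_t operator()(const Key& key) const
    {
        std::size_t seed = key.first;
        seed ^= key.second + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }

    // Ordering: true if first is less than second (lexicographic on the pair).
    bool operator()(const Key& first, const Key& second) const
    {
        return first < second;
    }
};
```

Because the mix is not symmetric, (1, 2) and (2, 1) hash differently, which matters if the order within a pair is significant.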

Edit: the documentation says you can also derive from a specialization of hash_compare and override only the members you do not like, so:

struct pair_comparator : public stdext::hash_compare<std::pair<std::size_t, std::size_t> >{
    typedef std::pair<std::size_t, std::size_t> Key;
    size_t operator()(const Key &key) const { return /* your hash code here */; }
    bool operator()(const Key &first, 
                const Key &second) const { return first < second; }
};

This should avoid the problem of having to define the const int members.
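For readers who are not tied to VS2008: stdext::hash_set is a nonstandard extension, and on a C++11 compiler the same deduplication can be written with std::unordered_set and a custom hasher. This is a different technique from the two answers above, shown only as a sketch; `IndexPair`, `IndexPairHash`, and `dedupe_pairs` are hypothetical names, and the mixing step is again an assumed hash_combine-style formula.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

typedef std::pair<std::size_t, std::size_t> IndexPair;

// Hasher for std::unordered_set; equality falls back to std::pair's operator==.
struct IndexPairHash
{
    std::size_t operator()(const IndexPair& p) const
    {
        std::size_t seed = std::hash<std::size_t>()(p.first);
        seed ^= std::hash<std::size_t>()(p.second)
                + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Returns the input pairs with duplicates dropped, keeping first-seen order.
std::vector<IndexPair> dedupe_pairs(const std::vector<IndexPair>& input)
{
    std::unordered_set<IndexPair, IndexPairHash> seen;
    std::vector<IndexPair> unique;
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        // insert().second is true only the first time a pair is seen.
        if (seen.insert(input[i]).second)
            unique.push_back(input[i]);
    }
    return unique;
}
```

Unlike hash_compare, std::unordered_set takes the hasher and the equality predicate as separate template parameters, so no ordering operator or bucket constants are needed.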

How to get std::hash_set<pair<T1, T2>> to compile and run
