Question

I wrote some code to test the performance of my unordered map, using a 2-component vector as the key.

std::unordered_map<Vector2i, int> m;

for(int i = 0; i < 1000; ++i)
    for(int j = 0; j < 1000; ++j)
        m[Vector2i(i,j)] = i*j+27*j;

clock.restart();

auto found = m.find(Vector2i(0,5));

std::cout << clock.getElapsedTime().asMicroseconds() << std::endl;

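Output of the code above: 56 (microseconds). When I replace 1000 with 100 in the for loops, the output is 2 (microseconds). Shouldn't the time be constant?

My hash function for Vector2i: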

namespace std
{
    template<>
    struct hash<Vector2i>
    {
        std::size_t operator()(const Vector2i& k) const
        {
            return (hash<int>()(k.x)) ^ (hash<int>()(k.y) << 1);
        }
    };
}

Edit: I added this code to count the collisions after the for loops:

// Counts the buckets that hold more than one element.
std::size_t collisions = 0;
for (std::size_t bucket = 0; bucket != m.bucket_count(); ++bucket)
    if (m.bucket_size(bucket) > 1)
        ++collisions;

100 * 100 elements: collisions = 256

1000 * 1000 elements: collisions = 2048

Answer 1

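A hash table only guarantees constant time on average (amortized); in the worst case a lookup in std::unordered_map is linear in the number of elements. If the hash table is well balanced (i.e., the hash function is good), most elements are spread evenly across the buckets. If the hash function is not so good, you may get a lot of collisions, and to access an element you then have to traverse a linked list (where the colliding elements are stored). So first make sure that the load factor and the hash function are OK in your case. Finally, make sure you compile your code in release mode with optimizations enabled (e.g. -O3 for g++/clang++).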
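As a rough way to check those points, here is a minimal sketch. It assumes a stand-in key type `Vec2` (the real `Vector2i` definition is not shown in the question) and reuses the question's XOR-and-shift combination; all helper names are hypothetical. It counts how many distinct hash values the keys produce, finds the longest bucket chain, and averages the lookup time over many lookups instead of timing a single `find`, which is mostly measuring clock noise.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the question's Vector2i.
struct Vec2 {
    int x, y;
    bool operator==(const Vec2& o) const { return x == o.x && y == o.y; }
};

// Same combination as in the question: hash(x) ^ (hash(y) << 1).
struct XorShiftHash {
    std::size_t operator()(const Vec2& k) const {
        return std::hash<int>()(k.x) ^ (std::hash<int>()(k.y) << 1);
    }
};

using Map = std::unordered_map<Vec2, int, XorShiftHash>;

// Fills the map with n * n keys, as the question's loops do.
Map build_map(int n) {
    assert(n >= 0);
    Map m;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            m[Vec2{i, j}] = i * j + 27 * j;
    return m;
}

// Number of different hash values produced by the n * n keys.
// If this is far below n * n, many keys must share a bucket.
std::size_t distinct_hash_values(int n) {
    std::unordered_set<std::size_t> seen;
    XorShiftHash h;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            seen.insert(h(Vec2{i, j}));
    return seen.size();
}

// Length of the longest bucket chain: the worst case a lookup can hit.
std::size_t longest_bucket(const Map& m) {
    std::size_t longest = 0;
    for (std::size_t b = 0; b != m.bucket_count(); ++b)
        longest = std::max(longest, m.bucket_size(b));
    return longest;
}

// Average time per lookup in nanoseconds, measured over n * n lookups.
double average_find_ns(const Map& m, int n) {
    assert(n > 0);
    long long checksum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            auto it = m.find(Vec2{i, j});
            if (it != m.end()) checksum += it->second;
        }
    auto stop = std::chrono::steady_clock::now();
    volatile long long sink = checksum;  // keep the loop from being optimized away
    (void)sink;
    return std::chrono::duration<double, std::nano>(stop - start).count()
           / (static_cast<double>(n) * n);
}
```

Calling `build_map(100)` and `build_map(1000)` and printing `m.load_factor()`, `longest_bucket(m)`, `distinct_hash_values(n)` and `average_find_ns(m, n)` for each should show whether the chains grow with the element count. On implementations where `std::hash<int>` returns the integer itself (an assumption, though common in libstdc++ and libc++), `x ^ (y << 1)` with x, y < 1000 can never exceed 2047, which would be consistent with the figure of 2048 reported in the question: a million keys sharing at most 2048 hash values means chains of several hundred elements.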

This question may also be useful: How to create a good hash_combine with 64 bit output (inspired by boost::hash_combine).
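For illustration, a hash_combine helper in that spirit might look like the sketch below. It follows the classic boost::hash_combine mixing formula (with the 32-bit constant 0x9e3779b9) as it is commonly quoted, not the 64-bit variant discussed in the linked question; `Vec2` and `Vec2CombinedHash` are hypothetical names standing in for the question's `Vector2i` and its hash.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// Mixes the hash of `value` into `seed`, in the style of the classic
// boost::hash_combine formula. The shifts and the added constant spread
// the bits of each component across the whole seed.
template <class T>
void hash_combine(std::size_t& seed, const T& value) {
    seed ^= std::hash<T>()(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical stand-in for the question's Vector2i.
struct Vec2 {
    int x, y;
    bool operator==(const Vec2& o) const { return x == o.x && y == o.y; }
};

// Hash functor that combines both components through hash_combine.
struct Vec2CombinedHash {
    std::size_t operator()(const Vec2& k) const {
        std::size_t seed = 0;
        hash_combine(seed, k.x);
        hash_combine(seed, k.y);
        return seed;
    }
};

// The functor is passed as the third template argument of the map.
using CombinedMap = std::unordered_map<Vec2, int, Vec2CombinedHash>;
```

Passing the functor as a template argument avoids specializing `std::hash`; whether it actually reduces collisions for a given key range is worth confirming by inspecting `bucket_size` as the question already does.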

Why is my std::unordered_map access time not constant
