Question

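Suppose I have a hash algorithm that is nice and smooth (the odds of any one hash value coming up are the same as for any other value).

Now say I know that the odds of picking 2 hashes and getting a collision are (for argument's sake) 50000:1.

Now say I pick 100 hashes. Given the odds of a collision in a set of 2, how do I calculate the odds of a collision within that set of 100 values?

What is the general solution to this, so that I can come up with a number of hash attempts after which the odds fall below some acceptable threshold? E.g. I could say things like "a batch of 49999 hash value creations has a high chance of a collision".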

Answer 1

This is a generalization of the Birthday problem.

Answer 2

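This sounds a lot like the Birthday Paradox to me.

You should be able to substitute the set of possible hash values (50000) for the set of possible birthdays (365) and run the same calculations presented there.

If you modify the Python script presented in the article for your values: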

def bp(n, d):
    # probability that n samples drawn from d equally likely values collide
    v = 1.0
    for i in range(n):
        v = v * (1 - float(i) / d)
    return 1 - v

print(bp(2, 50000))

You end up with odds of 0.00002 of two hashes colliding. At around 265 samples, you have roughly a 50% chance of a collision.
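For the "how many hashes before the odds cross a threshold" part of the question, the same product can be built up one hash at a time until it reaches the chosen limit. A minimal sketch, assuming uniformly distributed hashes; `hashes_until` and its parameters are names made up for illustration:

```python
def hashes_until(threshold, d):
    """Smallest n for which the collision probability among n hashes
    drawn uniformly from d values reaches `threshold` (0 < threshold <= 1)."""
    p_no_collision = 1.0
    n = 1  # a single hash cannot collide with anything
    while 1.0 - p_no_collision < threshold:
        # the next hash must avoid the n distinct values picked so far
        p_no_collision *= 1.0 - n / d
        n += 1
    return n


print(hashes_until(0.5, 50000))
```

With d = 365 and a threshold of 0.5 this returns 23, the classic birthday-problem figure.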

Answer 3

First calculate the probability that there is no collision:

from math import factorial

hashes_picked = 100
single_collision_odds = 50000

# safe_combinations is the number of ways to pick hashes that don't overlap
# (integer division is exact here and avoids overflowing a float)
safe_combinations = factorial(single_collision_odds) // factorial(single_collision_odds - hashes_picked)

# all_combinations is the total number of ways to pick hashes
all_combinations = single_collision_odds ** hashes_picked

collision_chance = (all_combinations - safe_combinations) / all_combinations
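The factorials involved become enormous for realistic hash spaces. One possible workaround, sketched here with `log1p` and `expm1` from Python's standard `math` module, is to sum the logarithms of the individual "no overlap" factors instead. The function name `collision_chance_log` is an illustrative choice, and the sketch assumes uniformly distributed hashes:

```python
from math import expm1, log1p


def collision_chance_log(hashes_picked, d):
    """Collision probability among `hashes_picked` hashes drawn uniformly
    from d possible values, accumulated in log space."""
    if hashes_picked > d:
        return 1.0  # pigeonhole: more hashes than values forces a collision
    # log of the no-collision probability: sum of log(1 - i/d) for each pick
    log_no_collision = sum(log1p(-i / d) for i in range(hashes_picked))
    # 1 - exp(x), computed accurately even when x is very close to zero
    return -expm1(log_no_collision)


print(collision_chance_log(100, 50000))
```

This runs in time proportional to the number of hashes picked and never builds a large integer, so it also stays accurate when the collision probability is tiny.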

Answer 4

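This is called the Birthday problem. To solve it, think instead about the probability of no collisions (call it p_nc).

p_nc(1) = 1
p_nc(2) = 1 - p_c(2)
p_nc(3) ≈ p_nc(2) * p_nc(2) * p_nc(2)   (three pairs, treated as independent, so this is an approximation)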
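Carrying that pattern forward, n hashes form n(n-1)/2 pairs, which suggests the estimate p_nc(n) ≈ p_nc(2)^(n(n-1)/2). Because it treats the pairs as independent, it is an approximation rather than the exact value. A rough sketch comparing the two, with function names chosen purely for illustration:

```python
def p_nc_pairwise(n, p_c2):
    """Estimate of the no-collision probability for n hashes, given the
    collision probability p_c2 of a single pair (pairs assumed independent)."""
    pairs = n * (n - 1) // 2
    return (1.0 - p_c2) ** pairs


def p_nc_exact(n, d):
    """Exact no-collision probability for n hashes over d equally likely values."""
    p = 1.0
    for i in range(n):
        p *= 1.0 - i / d
    return p


d = 50000
for n in (2, 100, 265):
    # columns: n, estimated collision chance, exact collision chance
    print(n, 1 - p_nc_pairwise(n, 1 / d), 1 - p_nc_exact(n, d))
```

The estimate tracks the exact value closely while n is small compared with d, and drifts as n grows.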

Answer 5

And in JS:

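function calculate(n, k)
{
    // n: number of possible hash values, k: number of hashes picked
    var result = 1;
    for (var i = 0; i < k; i++) {
        result = result * n / (n - i);
    }
    // result is now 1 / P(no collision); convert to P(collision) as a percentage
    result = (1 - 1 / result) * 100;
    return result;
}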
