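Will the value returned by MySQL's MD5 hash function keep changing indefinitely as the string passed to it grows without limit?
For example, will these keep returning different values: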
MD5("A"+"B"+"C")
MD5("A"+"B"+"C"+"D")
MD5("A"+"B"+"C"+"D"+"E")
MD5("A"+"B"+"C"+"D"+"E"+"D")
... and so on until a very long list of values ....
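At some point, when we give the function a very long input string, will the result stop changing, as if the input were being truncated?
I ask because I want to use the MD5 function to compare two records that have a large set of fields, by storing the MD5 hash of those fields.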
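======== Worked example (you don't need this to answer the question, but you may be interested) ========
I have a database application that periodically fetches data from an external source and uses it to update a MySQL table.
Let's suppose that in month #1 I download the data for the first time: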
downloaded data, where the first field is an ID, a key:
1,"A","B","C"
2,"A","D","E"
3,"B","D","E"
I store this
1,"A","B","C"
2,"A","D","E"
3,"B","D","E"
In month #2, I get:
1,"A","B","C"
2,"A","D","X"
3,"B","D","E"
4,"B","F","E"
Notice that the record with ID 2 has changed. Record with ID 4 is new. So I store two new records:
1,"A","B","C"
2,"A","D","E"
3,"B","D","E"
2,"A","D","X"
4,"B","F","E"
This way I have a history of *changes* to the data.
I don't want to have to compare each field of the incoming data with each field of each of the stored records.
E.g., if I'm comparing incoming record x with existing record a, I don't want to have to say:
Add record x to the stored data if there is no record a such that x.ID == a.ID AND x.F1 == a.F1 AND x.F2 == a.F2 AND x.F3 == a.F3 [4 comparisons]
What I want to do is to compute an MD5 hash and store it:
1,"A","B","C",MD5("A"+"B"+"C")
Let's suppose that it is month #3, and I get a record:
1,"A","G","C"
What I want to do is compute the MD5 hash of the new fields: MD5("A"+"G"+"C") and compare the resulting hash with the hashes in the stored data.
If it doesn't match, then I add it as a new record.
I.e., Add record x to the stored data if there is no record a such that x.ID == a.ID AND MD5(x.F1 + x.F2 + x.F3) == a.stored_MD5_value [2 comparisons]
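The comparison described above could be sketched roughly as follows. This is a minimal illustration in Python rather than MySQL, and the names `row_hash`, `is_new_version`, `stored` and the separator `SEP` are hypothetical, not part of any existing schema. It assumes the separator character never occurs inside a field value. Note that in MySQL itself `+` is numeric addition, so the `"A"+"B"+"C"` notation used here would have to become something like `CONCAT_WS(sep, f1, f2, f3)` in a real query.

```python
import hashlib

# Assumption: this separator never appears inside a field value.
SEP = "\x1f"


def row_hash(fields):
    """Return the MD5 hex digest of the fields joined with a separator."""
    joined = SEP.join(fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Stored history as (id, stored_md5) pairs, mirroring the month #1 data.
stored = {
    (1, row_hash(["A", "B", "C"])),
    (2, row_hash(["A", "D", "E"])),
    (3, row_hash(["B", "D", "E"])),
}


def is_new_version(record_id, fields):
    """True if no stored row has this ID together with this hash."""
    return (record_id, row_hash(fields)) not in stored


print(is_new_version(1, ["A", "B", "C"]))  # unchanged record
print(is_new_version(1, ["A", "G", "C"]))  # changed record from month #3

# Without a separator, both of these rows would concatenate to "ABC"
# and hash identically even though the fields differ.
print(row_hash(["AB", "C"]) == row_hash(["A", "BC"]))
```

The separator matters more than the number of fields: plain concatenation cannot tell `"AB","C"` from `"A","BC"`, which would produce a false match that has nothing to do with MD5 itself.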
My question is "Can I compare the MD5 hash of, say, 50 fields without increasing the likelihood of clashes?"
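Answer 0 (score: 1)
Yes, in practice it should keep changing. Because of the pigeonhole principle, if you keep doing this you will eventually hit a collision, but it is impractical for you to ever reach that point.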
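A quick way to convince yourself is to hash a steadily growing string and count the distinct digests. The sketch below uses Python's `hashlib` rather than MySQL, on the assumption that both compute standard MD5 over the same bytes. The helper names `md5_hex` and `accidental_collision_estimate` are made up for this example, and the estimate uses the usual birthday approximation, which assumes the digests behave like uniformly random 128-bit values for non-adversarial input.

```python
import hashlib


def md5_hex(text):
    """MD5 hex digest of a string, encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Build "A", "AB", "ABC", ... and keep growing well past a few fields.
digests = set()
text = ""
for i in range(10_000):
    text += chr(ord("A") + i % 26)
    digests.add(md5_hex(text))

print(len(text))     # length of the final input
print(len(digests))  # number of distinct digests seen along the way

# Changing only the last character of a very long input still changes
# the digest, so the input is not being truncated.
long_a = "x" * 1_000_000 + "A"
long_b = "x" * 1_000_000 + "B"
print(md5_hex(long_a) != md5_hex(long_b))


def accidental_collision_estimate(n):
    """Birthday approximation: chance that n random 128-bit hashes collide."""
    return n * (n - 1) / 2 / 2**128


# Rough chance of any accidental collision among a billion stored rows.
print(accidental_collision_estimate(10**9))
```

The digest is always 128 bits no matter how long the input is, so hashing 50 fields instead of 3 does not by itself make an accidental clash more likely; what matters is how many hashes are being compared.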
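Answer 1 (score: 1)
The security of the MD5 hash function is severely compromised. A collision attack exists that can find collisions on a computer with a 2.6 GHz Pentium 4 processor (complexity of about 2^24). There is also a chosen-prefix collision attack that can produce a collision for two chosen, arbitrarily different inputs within hours, using off-the-shelf computing hardware (complexity 2^39). The ability to find collisions has been greatly aided by the use of off-the-shelf GPUs. On an NVIDIA GeForce 8400GS graphics processor, 16 to 18 million hashes per second can be computed. An NVIDIA GeForce 8800 Ultra can calculate more than 200 million hashes per second.
These hash and collision attacks have been demonstrated in public in various situations, including colliding document files and digital certificates. See http://www.win.tue.nl/hashclash/On%20Collisions%20for%20MD5%20-%20M.M.J.%20Stevens.pdf
Many projects have published MD5 rainbow tables online, which can be used to reverse many MD5 hashes into strings that collide with the original input, usually for the purpose of password cracking.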