I'm trying to run this code.
It fails on line 312 with:
hashCode = coeffA[i] * shingleID + coeffB[i] % nextPrime
OverflowError: repeated bytes are too long
# For each shingle in the document...
for shingleID in shingleIDSet:
    # Evaluate the hash function.
    hashCode = (coeffA[i] * shingleID + coeffB[i]) % nextPrime
    # Track the lowest hash code seen.
    if hashCode < minHashCode:
        minHashCode = hashCode
I haven't modified the code at all, other than converting it from Python 2 to 3.
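
A possible lead, stated as an assumption rather than a diagnosis: Python 3 raises this OverflowError when an int is multiplied by a bytes object and the repeated result would be too long. That would fit if shingleID ended up as bytes instead of an int after the conversion. The sketch below uses made-up values (`coeff`, `shingle`, and the helper `try_multiply` are hypothetical, not from the script) to compare the two cases:

```python
import binascii
import sys


def try_multiply(coeff, shingle_id):
    """Return coeff * shingle_id, or the error text if it overflows."""
    try:
        return coeff * shingle_id
    except OverflowError as exc:
        return "OverflowError: {}".format(exc)


# Hypothetical values, not taken from the original script.
coeff = sys.maxsize  # large enough to overflow on any platform
shingle = "some shingle text"

# Assumed failure mode: the shingle ID is a bytes object,
# so `*` means sequence repetition rather than arithmetic.
bytes_id = shingle.encode("utf-8")
bytes_result = try_multiply(coeff, bytes_id)

# Assumed fix: the shingle ID is an int, here a CRC32 of the encoded shingle,
# so `*` is ordinary integer multiplication.
int_id = binascii.crc32(shingle.encode("utf-8")) & 0xffffffff
int_result = try_multiply(coeff, int_id)

print(bytes_result)
print(type(int_result).__name__)
```

If this is what is happening, printing `type(shingleID)` just before the failing line should show `bytes` rather than `int`.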