How to compare 2 strings by their hashes

Time: 2018-11-03 20:02:42

Tags: c++ string algorithm hash computer-science

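Suppose I have 2 strings (a needle and a haystack), and I want to hash every substring of the haystack that has the same length as the needle and compare each of those hashes with the hash of the needle.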

Example:

na      << needle   , pHashes: 0 14 45 0 0 0 0 0 0  , needle hash = 45
banana  << haystack , pHashes: 0 2 33 13487 43278 12972572 41601723 0 0 
-----------------------------
^^      << 1st substring (ba)
 ^^     << 2nd substring (an)
  ^^    << 3rd substring (na)
   ^^   << 4th substring (an)
    ^^  << 5th substring (na)
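I want to compute all prefix hashes of the haystack string and combine the prefix hashes at indices i and j to get the hash of the substring s[i…j], for example to check that hash(needle) == hash(3rd substring).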

I use the function getHash to get the hash of a string:

const long long hashMod = (long long) ( 1e15 + 9 );
const int prime = 31;

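long long getHash(const string& str) {
    const int len = str.length();
    long long hash = 0, powVal = 1;  // powVal holds prime^i mod hashMod
    for (int i = 0; i < len; ++i) {
        // Map 'a'..'z' to 1..26 so that no character hashes to 0.
        hash = ( hash + ( str[i] - 'a' + 1 ) * powVal ) % hashMod;
        powVal = ( powVal * prime ) % hashMod;
    }
    return hash;
}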
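
The question lists the values of pHashes and powers but not the code that fills them. A minimal sketch of how they could be built is below, assuming pHashes[i] holds the hash of the first i characters (so pHashes[0] = 0) and the input contains only 'a'..'z'. The helper names buildPowers and buildPrefixHashes are hypothetical and do not appear in the original code.

```cpp
#include <string>
#include <vector>

const long long hashMod = 1000000000000009LL;  // 1e15 + 9, as in the question
const int prime = 31;

// Hypothetical helper: powers[i] = prime^i mod hashMod, for i in [0, n].
std::vector<long long> buildPowers(int n) {
    std::vector<long long> powers(n + 1, 1);
    for (int i = 1; i <= n; ++i)
        powers[i] = powers[i - 1] * prime % hashMod;  // at most about 31e15, fits in 64 bits
    return powers;
}

// Hypothetical helper: pHashes[i] = hash of the first i characters of s.
std::vector<long long> buildPrefixHashes(const std::string& s) {
    const int n = static_cast<int>(s.size());
    const std::vector<long long> powers = buildPowers(n);
    std::vector<long long> pHashes(n + 1, 0);
    for (int i = 0; i < n; ++i)
        pHashes[i + 1] = (pHashes[i] + (s[i] - 'a' + 1) * powers[i]) % hashMod;
    return pHashes;
}
```

For "banana" this gives 0 2 33 13487 43278 12972572 41601723, and for "na" it gives 0 14 45, which agrees with the example above.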

And the function get(l, r) to get the hash of the range [l ... r]:

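long long get(int l, int r) {
    // l and r are 1-based and inclusive; pHashes[i] is the hash of the first i characters.
    const int len = haystack.length();
    // Add hashMod before reducing so the difference cannot go negative.
    const long long diff = ( pHashes[r] - pHashes[l-1] + hashMod ) % hashMod;
    // Note: with a modulus near 1e15 this product can overflow a 64-bit long long.
    return ( diff * powers[len - r] ) % hashMod;
}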

powers[] = 1 31 961 29791 923521 28629151

My main problem is shifting all substrings to the same power.
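
One possible way around the shifting problem, sketched here under stated assumptions rather than offered as a definitive fix: the window that starts at 0-based position start carries an extra factor of prime^start, so instead of shifting every window you can multiply the needle's hash by that same power before comparing. The function name findByHash and the constants MOD and BASE are hypothetical. The modulus is deliberately swapped for 1e9 + 7 (not the question's 1e15 + 9) so that the product of two residues fits in a 64-bit integer. Input is assumed to contain only 'a'..'z'.

```cpp
#include <string>
#include <vector>

// Assumption: a modulus near 1e9 instead of the question's 1e15 + 9,
// so that the product of two residues stays below 2^63.
const long long MOD = 1000000007LL;
const long long BASE = 31;

// Returns the 0-based start positions where a window's hash equals the needle's hash.
std::vector<int> findByHash(const std::string& haystack, const std::string& needle) {
    const int n = static_cast<int>(haystack.size());
    const int m = static_cast<int>(needle.size());
    std::vector<int> hits;
    if (m == 0 || m > n) return hits;

    // powers[i] = BASE^i mod MOD; pre[i] = hash of the first i characters.
    std::vector<long long> powers(n + 1, 1), pre(n + 1, 0);
    for (int i = 0; i < n; ++i) {
        powers[i + 1] = powers[i] * BASE % MOD;
        pre[i + 1] = (pre[i] + (haystack[i] - 'a' + 1) * powers[i]) % MOD;
    }

    long long needleHash = 0;
    for (int i = 0; i < m; ++i)
        needleHash = (needleHash + (needle[i] - 'a' + 1) * powers[i]) % MOD;

    for (int start = 0; start + m <= n; ++start) {
        // The raw window hash equals BASE^start times the hash of the window on its own.
        const long long window = (pre[start + m] - pre[start] + MOD) % MOD;
        // Scale the needle by the same factor instead of dividing the window.
        if (window == needleHash * powers[start] % MOD) hits.push_back(start);
    }
    return hits;
}
```

With the numbers from the example, the 3rd substring gives pHashes[4] - pHashes[2] = 43278 - 33 = 43245, and the scaled needle hash is 45 * 961 = 43245, so the two agree. Equal hashes only indicate a probable match, so a direct string comparison on each hit is still needed if false positives are unacceptable.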

0 answers:

No answers