Rabin-Karp algorithm in C#

Time: 2014-04-17 18:46:32

Tags: c# algorithm rabin-karp

I have implemented the Rabin-Karp algorithm in C#.NET, following this pseudocode:

pseudo code

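The problem is that the pattern is never matched against the original text. I have gone through the code thoroughly, but I cannot identify the problem. Can anyone point out the error in the code?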

static void Main(string[] args)
{
    string text = "ratcatpat catbats";
    string pattern = "cat";

    int d = text.Select(e => e).Distinct().Count();

    RabinCarp(text, pattern, d, 17);

    Console.ReadKey();
}

static void RabinCarp(string text, string pattern, int sizeOfAlphabet, int moduloValue)
{ 
    int rollingHashOf_P = 0;
    int rollingHashOf_T = 0;

    int lengthOfText = text.Length;
    int lengthOfPattern = pattern.Length;
    int h = (int)(Math.Pow(sizeOfAlphabet, lengthOfPattern - 1) % moduloValue);

    for (int i = 0; i < lengthOfPattern; i++)
    {
        rollingHashOf_P = (sizeOfAlphabet * rollingHashOf_P + (int)pattern[i]) % moduloValue;
        rollingHashOf_T = (sizeOfAlphabet * rollingHashOf_T + (int)text[i]) % moduloValue;
    }

    int diffNM = lengthOfText - lengthOfPattern;

    for (int i = 0; i <= diffNM; i++)
    {
        if (Math.Abs(rollingHashOf_P) == Math.Abs(rollingHashOf_T))
        {
            if (text.Substring(i, lengthOfPattern).Contains(pattern))
            {
                string message = "pattern identified";
                Console.WriteLine(message);
            }
        }   
        if (i < diffNM)
        {
            rollingHashOf_T = Math.Abs(sizeOfAlphabet * (rollingHashOf_T - (int)text[i] * h) + (int)text[i + lengthOfPattern]) % moduloValue;
        }
    }
}

1 answer:

Answer 0: (score: 2)

I'm not familiar with the Rabin-Karp algorithm, but I'm sure you should advance rollingHashOf_P as well as rollingHashOf_T:

<strike>
if (i < diffNM)
{
    rollingHashOf_T = Math.Abs(sizeOfAlphabet * (rollingHashOf_T - (int)text[i] * h) + (int)text[i + lengthOfPattern]) % moduloValue;
    rollingHashOf_P = Math.Abs(sizeOfAlphabet * (rollingHashOf_P - (int)pattern[i] * h) + (int)pattern[i + lengthOfPattern]) % moduloValue;
}

</strike>

The OP shared this pseudocode in the comments below:

pseudo code

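Clearly, the above is wrong. Comparing the pseudocode with the code in the post, though, suggests that the error is probably in how rollingHashOf_T is advanced, which currently reads: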

rollingHashOf_T = Math.Abs(sizeOfAlphabet * (rollingHashOf_T - 
  (int)text[i] * h) + (int)text[i + lengthOfPattern]) % moduloValue;

whereas the pseudocode indicates it should be:

rollingHashOf_T = Math.Abs(sizeOfAlphabet * (rollingHashOf_T - 
  (int)text[i + 1] * h) + (int)text[i + lengthOfPattern + 1]) % moduloValue;
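
Two caveats are worth checking before applying that change. First, if the pseudocode uses 1-based indexing (an assumption, since the image is not reproduced here), then text[i] and text[i + lengthOfPattern] are already the right characters in 0-based C#, and with the posted loop bounds text[i + lengthOfPattern + 1] would index one past the end of the string on the last shift. Second, the C# % operator keeps the sign of its left operand, so Math.Abs(x) % q is not the same as x mod q when x is negative: -5 mod 17 is 12, but Math.Abs(-5) % 17 is 5. The following is a minimal sketch of the search with a non-negative modulo, not a definitive implementation. The names RabinKarpSketch, FindAll and Mod, and the radix of 256, are hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;

static class RabinKarpSketch
{
    // Hypothetical helper: maps C#'s possibly negative remainder into [0, m).
    static int Mod(long value, int m) => (int)(((value % m) + m) % m);

    // Returns every 0-based index at which pattern occurs in text.
    public static List<int> FindAll(string text, string pattern, int radix, int modulus)
    {
        var matches = new List<int>();
        int n = text.Length, m = pattern.Length;
        if (m == 0 || m > n) return matches;

        // h = radix^(m-1) mod modulus, computed with integers instead of Math.Pow.
        int h = 1;
        for (int i = 0; i < m - 1; i++) h = Mod((long)h * radix, modulus);

        // Initial hashes of the pattern and of the first window of the text.
        int hashP = 0, hashT = 0;
        for (int i = 0; i < m; i++)
        {
            hashP = Mod((long)radix * hashP + pattern[i], modulus);
            hashT = Mod((long)radix * hashT + text[i], modulus);
        }

        for (int i = 0; i <= n - m; i++)
        {
            // Equal hashes only suggest a match, so compare the characters too.
            if (hashP == hashT && string.CompareOrdinal(text, i, pattern, 0, m) == 0)
                matches.Add(i);

            if (i < n - m)
            {
                // Drop text[i], shift by the radix, append text[i + m] (0-based).
                long next = (long)radix * (hashT - (long)text[i] * h) + text[i + m];
                hashT = Mod(next, modulus);
            }
        }
        return matches;
    }

    static void Main()
    {
        foreach (int index in FindAll("ratcatpat catbats", "cat", 256, 17))
            Console.WriteLine("pattern identified at index " + index);
    }
}
```

With the text and pattern from the question, this sketch reports matches at indices 3 and 10. Only the text hash is rolled forward; the pattern hash is computed once and stays fixed.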