Question

I am studying the KMP algorithm for string matching. Below is code I found online for computing the longest prefix suffix array.
Definition:

lps[i] = the length of the longest proper prefix of pat[0..i] 
              which is also a suffix of pat[0..i].
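
To make the definition concrete, here is a minimal brute-force sketch that computes the array directly from it. The function name `lpsByDefinition` is made up for this illustration and is not part of the code in question; it is far slower than KMP's own preprocessing and is only meant as a reference to check other implementations against.

```cpp
#include <string>
#include <vector>

// Brute-force LPS taken straight from the definition (hypothetical helper).
// For each i, try every proper prefix length k of pat[0..i], longest first,
// and keep the first k whose prefix equals the suffix of the same length.
std::vector<int> lpsByDefinition(const std::string& pat) {
    std::vector<int> lps(pat.size(), 0);
    for (std::size_t i = 0; i < pat.size(); ++i) {
        // "Proper" means the prefix is shorter than pat[0..i], so k <= i.
        for (std::size_t k = i; k > 0; --k) {
            // Compare prefix pat[0..k-1] with suffix pat[i-k+1..i].
            if (pat.compare(0, k, pat, i - k + 1, k) == 0) {
                lps[i] = static_cast<int>(k);
                break;
            }
        }
    }
    return lps;
}
```

For the pattern AAACAAAA mentioned in the code comment below, this gives 0 1 2 0 1 2 3 3.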

Code:

void computeLPSArray(char *pat, int M, int *lps)
{
    int len = 0;  // length of the previous longest prefix suffix
    int i;

    lps[0] = 0; // lps[0] is always 0
    i = 1;

    // the loop calculates lps[i] for i = 1 to M-1
    while(i < M)
    {
       if(pat[i] == pat[len])
       {
         len++;
         lps[i] = len;
         i++;
       }
       else // (pat[i] != pat[len])
       {
         if( len != 0 )
         {
           // This is tricky. Consider the example AAACAAAA and i = 7.
           len = lps[len-1]; //*****************

           // Also, note that we do not increment i here
         }
         else // if (len == 0)
         {
           lps[i] = 0;
           i++;
         }
       }
    }
}

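Can I use len = len-1 instead of len = lps[len-1]? Since len always counts a prefix length over [0 .. someIndex], why is lps used for the assignment here? Below are the cases I tested, all of which work fine (the first line is the pattern, and the next two lines are the results for the original and the modified len assignment):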

a  a  a  b  a  b  c  
0  1  2  0  1  0  0  
0  1  2  0  1  0  0 

a  b  c  b  a  b  c  
0  0  0  0  1  2  3  
0  0  0  0  1  2  3  

a  a  b  c  b  a  b  
0  1  0  0  0  1  0  
0  1  0  0  0  1  0

The code with both variants is here: http://ideone.com/qiSrUo

Answer 1

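Here is a case where it does not work (c1 is the result with the original len = lps[len-1], c2 the result with the modified len = len-1):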

i     0  1  2  3  4  5
p     A  B  A  B  B  A 
c1    0  0  1  2  0  1
c2    0  0  1  2  2  3

The reason:

At i=4, len=2 
p[i]='B' and p[len]='A' //Mismatch!
lps string up to i=3: AB(0-1 prefix), (2-3 suffix)
-------------------------------
i=4
Next character: B
len=2 // longest prefix suffix length 
Character looking for : A (=p[len])

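So at i = 3 we have AB (0-1) as the prefix that matches the suffix AB (2-3), but at i = 4 there is a mismatch, so we can't extend the original prefix (0-1). The position to check next is therefore the prefix found before "AB", that is, the longest proper prefix of "AB" that is also its suffix. That length is given by lps[len-1] (the -1 is there because the array is indexed from 0). It is not necessarily len-1, because we may need to step back further than that to get the new longest prefix suffix.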
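
A small sketch to reproduce the table above. The function `computeLps` and its `useLpsFallback` flag are hypothetical names introduced only for this comparison; the loop itself follows the code in the question, with the fallback step switchable between the two assignments.

```cpp
#include <string>
#include <vector>

// Computes the LPS array of pat.
// useLpsFallback == true  : on mismatch, len = lps[len - 1] (original code)
// useLpsFallback == false : on mismatch, len = len - 1      (proposed change)
// The flag is an assumption made for this illustration only.
std::vector<int> computeLps(const std::string& pat, bool useLpsFallback) {
    std::vector<int> lps(pat.size(), 0);
    int len = 0;  // length of the previous longest prefix suffix
    std::size_t i = 1;
    while (i < pat.size()) {
        if (pat[i] == pat[len]) {
            ++len;
            lps[i] = len;
            ++i;
        } else if (len != 0) {
            // Fall back to a shorter candidate prefix; i is not advanced.
            len = useLpsFallback ? lps[len - 1] : len - 1;
        } else {
            lps[i] = 0;
            ++i;
        }
    }
    return lps;
}
```

Calling `computeLps("ABABBA", true)` yields 0 0 1 2 0 1 (row c1), while `computeLps("ABABBA", false)` yields 0 0 1 2 2 3 (row c2). With len = len-1, the candidate length 1 is accepted at i = 4 merely because p[4] == p[1], even though the prefix "AB" is not a suffix of "ABABB".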

Answer 2

The following is the best explanation I have seen. The example in it will clearly answer your question.

Knuth–Morris–Pratt(KMP) Pattern Matching(Substring search)

Answer 3

Here is my KMP code:

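#include <iostream>
#include <string>
#include <vector>
using namespace std;


int main(void){
    int t;
    cin>>t;                          // number of test cases
    while(t--){
        string s;
        if(!(cin>>s)) break;         // stop if the input ends early
        int n = s.length();
        vector<int> arr(n, 0);       // arr[i] = LPS length for s[0..i] (replaces the non-standard VLA)
        int len = 0;                 // length of the previous longest prefix suffix
        for(int i = 1;i<n;){
            if(s[len]==s[i]){
                // the current prefix suffix can be extended by one character
                len++;
                arr[i++] = len;
            }
            else{
                if(len!=0){
                    // fall back to the next shorter prefix suffix; i stays put
                    len = arr[len-1];
                }
                else{
                    arr[i] = 0;
                    i++;
                }
            }
        }
        // LPS length of the whole string
        cout<<arr[n-1]<<endl;
    }


    return 0;
}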

The time complexity is O(N), where N is the length of the string.
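
One way to see the linear bound: every pass through the loop either advances i (at most N-1 times) or strictly decreases len, and len only grows by one each time i advances, so there are at most 2(N-1) passes in total. The sketch below counts the passes; `countLpsIterations` is a hypothetical helper written for this illustration, using the same loop as the code above.

```cpp
#include <string>
#include <vector>

// Runs the same LPS loop as above and returns how many times the loop body
// executed (hypothetical helper, for illustration only).
long countLpsIterations(const std::string& s) {
    std::vector<int> lps(s.size(), 0);
    int len = 0;
    long iterations = 0;
    for (std::size_t i = 1; i < s.size();) {
        ++iterations;
        if (s[len] == s[i]) {
            ++len;
            lps[i] = len;
            ++i;
        } else if (len != 0) {
            len = lps[len - 1];  // strictly smaller than len
        } else {
            lps[i] = 0;
            ++i;
        }
    }
    return iterations;
}
```

For example, `countLpsIterations("AAACAAAA")` returns 10, which is within the bound 2 * (8 - 1) = 14.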

String matching: computing the longest prefix suffix array in the KMP algorithm

3 answers: