Question

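I found a similar question on StackOverflow, but my problem is different.

Given a string s consisting of lowercase letters, I want to find the length of the longest common prefix (LCP) of each of its suffixes with s itself.

For example: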

s = 'ababac'

Then the substrings (the suffixes of s) are as follows:

1: s(1, 6) = ababac
2: s(2, 6) = babac
3: s(3, 6) = abac
4: s(4, 6) = bac
5: s(5, 6) = ac
6: s(6, 6) = c

Now, the LCP lengths of all these substrings with s are as follows:

1: len(LCP(s(1, 6), s)) = 6 
2: len(LCP(s(2, 6), s)) = 0
3: len(LCP(s(3, 6), s)) = 3
4: len(LCP(s(4, 6), s)) = 0
5: len(LCP(s(5, 6), s)) = 1
6: len(LCP(s(6, 6), s)) = 0

I am using character-by-character matching:

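    #include <algorithm>
    #include <string>
    using std::string;

    // Length of the longest common prefix of s1 and s2,
    // found by comparing the two strings character by character.
    int commonPrefix(const string& s1, const string& s2) {
        int minlen = static_cast<int>(std::min(s1.size(), s2.size()));
        int result = 0;
        for (int i = 0; i < minlen; i++) {
            if (s1[i] != s2[i])
                return result;  // the first mismatch ends the common prefix
            result++;
        }

        return result;
    }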

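But this is still O(n²) overall. I know that all the substrings overlap each other, so it should be possible to optimize further. Can anyone help optimize this code?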

Answer 1

This is similar to the Z algorithm used for pattern matching, except for the first case, where len(LCP(s(1, 6), s)) = len(s).

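We need to build a Z array. For a string str[0..n-1], the Z array has the same length as the string. The element Z[i] stores the length of the longest substring starting at str[i] that is also a prefix of str[0..n-1]. The first entry of the Z array carries little meaning, because the complete string is always a prefix of itself.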
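To make the definition concrete, here is a minimal C++ sketch that fills the Z array directly from that definition, with no optimization. The name `zNaive` is a hypothetical helper chosen for illustration, and the O(n²) worst case is the same as in the question's approach.

```cpp
#include <string>
#include <vector>

// Z array computed straight from its definition (O(n^2) in the worst case).
// zNaive is a hypothetical helper name, used here only for illustration.
std::vector<int> zNaive(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    for (int i = 0; i < n; ++i) {
        // Count how many characters of s[i..] agree with the start of s.
        while (i + z[i] < n && s[i + z[i]] == s[z[i]]) {
            ++z[i];
        }
    }
    return z;
}
```

For `"ababac"` this gives 6, 0, 3, 0, 1, 0, the same values as the list in the question, with Z[i] corresponding to s(i+1, 6).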

You can visualize the algorithm here: https://personal.utdallas.edu/~besp/demo/John2010/z-algorithm.htm

Here is a solution based on it:

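    public static int[] computeZ(String s) {
        int len = s.length();
        int[] Z = new int[len];
        if (len == 0) {
            return Z;
        }
        // [l, r) is the rightmost segment known to match a prefix of s
        int l = 0, r = 0;
        for (int k = 1; k < len; k++) {
            int j;
            if (k < r) {
                // reuse the value computed inside the segment, capped at its right edge
                j = (Z[k - l] < (r - k)) ? Z[k - l] : (r - k);
            } else {
                j = 0;
            }
            // extend the match character by character
            while (k + j < len) {
                if (s.charAt(k + j) == s.charAt(j)) {
                    j++;
                } else {
                    break;
                }
            }
            Z[k] = j;
            if (k + j > r) {
                l = k;
                r = k + j;
            }
        }
        Z[0] = len; // the whole string is always a prefix of itself
        return Z;
    }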
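Since the question's code is C++, the same idea can be sketched in that language. This is a port written for illustration under the usual Z-algorithm formulation, not the answerer's code; the name `computeZ` simply mirrors the Java method.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Linear-time Z array: z[k] is the length of the longest common prefix
// of s and the suffix of s starting at index k (0-based).
std::vector<int> computeZ(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    if (n == 0) return z;
    int l = 0, r = 0;  // [l, r) is the rightmost segment known to match a prefix
    for (int k = 1; k < n; ++k) {
        int j = 0;
        // Inside the segment, start from the mirrored value, capped at the segment end.
        if (k < r) j = std::min(z[k - l], r - k);
        // Extend the match by direct comparison.
        while (k + j < n && s[k + j] == s[j]) ++j;
        z[k] = j;
        if (k + j > r) { l = k; r = k + j; }
    }
    z[0] = n;  // the whole string is a prefix of itself
    return z;
}
```

Calling `computeZ("ababac")` yields 6, 0, 3, 0, 1, 0, which matches the LCP lengths listed in the question. The right edge `r` never moves backward and every successful comparison past it advances it, which is why the total work is linear in the string length.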

Answer 2

As Aditya mentioned, this can be solved with the Z algorithm. A detailed explanation of the implementation can be found here: https://www.hackerearth.com/practice/algorithms/string-algorithm/z-algorithm/tutorial/

Length of the longest common prefix of all substrings and a string
