Find a prefix that is also a suffix

Time: 2014-04-15 17:19:17

Tags: string algorithm substring

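I am looking for the best solution to this problem.

Given a string s of length n, find a prefix, taken from the left, that equals a suffix, taken from the right.

The prefix and the suffix may overlap.

Example: given abababa, the prefix is [ababa]ba and the suffix is ab[ababa].

This is how far I have got:

  1. For each i = 0 to n-1, take the prefix ending at i and check whether a matching suffix exists. This takes O(n^2) time and O(1) space.

  2. I came up with an optimization in which we index the positions of every character. That lets us eliminate part of the sample space from approach 1. The worst-case complexity is still O(n^2), with O(n) extra space.

  3. Is there a better algorithm?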
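For reference, approach 1 can be sketched in Java roughly as follows. This is only an illustration under two assumptions the question does not state outright: the longest match is wanted, and the prefix must be shorter than the whole string. The class and method names are made up for the example.

```java
class BruteForceBorder {
    // Returns the length of the longest proper prefix of s that is also a suffix of s.
    // O(n^2) time in the worst case, O(1) extra space.
    static int longestBorder(String s) {
        int n = s.length();
        // Try candidate lengths from longest to shortest; the first hit is the answer.
        for (int len = n - 1; len > 0; len--) {
            // Compare s[0..len) with s[n-len..n) in place, without allocating substrings.
            if (s.regionMatches(0, s, n - len, len)) {
                return len;
            }
        }
        return 0; // no non-empty proper prefix matches a suffix
    }

    public static void main(String[] args) {
        // The example from the question: prefix [ababa]ba, suffix ab[ababa].
        System.out.println(longestBorder("abababa")); // prints 5
    }
}
```

The overlap allowed by the question is handled naturally here, because the two compared regions are addressed by offset and may share characters.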

3 answers:

Answer 0: (score: 3)

A simple implementation in C#:

        string S = "azffffaz";

        char[] characters = S.ToCharArray();

        // cumulativeCharMatches[i] is the length of the longest proper prefix of S
        // that is also a suffix of S[0..i] (the KMP failure function).
        int[] cumulativeCharMatches = new int[characters.Length];

        // Number of prefix characters currently matched; also the index of the
        // next prefix character to compare against.
        int matchCount = 0;

        for (int i = 1; i < characters.Length; i++)
        {
            // On a mismatch, fall back to the next shorter candidate instead of
            // resetting to 0, so that overlapping matches are not lost.
            while (matchCount > 0 && characters[i] != characters[matchCount])
            {
                matchCount = cumulativeCharMatches[matchCount - 1];
            }

            if (characters[i] == characters[matchCount])
            {
                matchCount += 1;
            }

            cumulativeCharMatches[i] = matchCount;
        }

        // The entry for the last character is the length of the longest suffix of S
        // that is also a prefix. The maximum over the whole array would also count
        // matches that end before the end of the string.
        return characters.Length == 0 ? 0 : cumulativeCharMatches[characters.Length - 1];

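Answer 1: (score: 2)

Use the KMP algorithm. The state of the algorithm is "the longest suffix of the haystack that is still a prefix of the needle". So just take your string as the needle and the string without its first character as the haystack. This runs in O(N) time and O(N) space.

An implementation with some examples: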

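import java.util.Arrays;

// The methods below are meant to be placed inside a class.
// backFunc[i] is the length of the longest proper prefix of needle[0..i) that is also its suffix.
public static int[] create(String needle) {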
    int[] backFunc = new int[needle.length() + 1];
    backFunc[0] = backFunc[1] = 0;
    for (int i = 1; i < needle.length(); ++i) {
        int testing = i - 1;
        while (backFunc[testing] != testing) {
            if (needle.charAt(backFunc[testing]) == needle.charAt(i-1)) {
                backFunc[i] = backFunc[testing] + 1;
                break;
            } else {
                testing = backFunc[testing];
            }
        }
    }
    return backFunc;
}

public static int find(String needle, String haystack) {
    // some unused character to ensure that we always return back and never reach the end of the
    // needle
    needle = needle + "$";
    int[] backFunc = create(needle);
    System.out.println(Arrays.toString(backFunc));
    int curpos = 0;
    for (int i = 0; i < haystack.length(); ++i) {
        while (curpos != backFunc[curpos]) {
            if (haystack.charAt(i) == needle.charAt(curpos)) {
                ++curpos;
                break;
            } else {
                curpos = backFunc[curpos];
            }
        }
        if (curpos == 0 && needle.charAt(0) == haystack.charAt(i)) {
            ++curpos;
        }
        System.out.println(curpos);
    }
    return curpos;
}

public static void main(String[] args) {
    String[] tests = {"abababa", "tsttst", "acblahac", "aaaaa"};
    for (String test : tests) {
        System.out.println("Length is : " + find(test, test.substring(1)));
    }
}
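As a cross-check, the same lengths can be read from the last entry of the failure function computed on the string itself, with no sentinel character and no separate haystack. The following is a minimal sketch of that variant, not a rewrite of the code above; the class and method names are hypothetical.

```java
class PrefixFunctionBorder {
    // fail[i] = length of the longest proper prefix of s[0..i] that is also a suffix of s[0..i].
    static int[] failure(String s) {
        int[] fail = new int[s.length()];
        int k = 0; // number of prefix characters currently matched
        for (int i = 1; i < s.length(); i++) {
            // On a mismatch, fall back to the next shorter border of the matched part.
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = fail[k - 1];
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Length of the longest proper prefix of s that is also a suffix of s.
    static int longestBorder(String s) {
        if (s.isEmpty()) {
            return 0;
        }
        return failure(s)[s.length() - 1];
    }

    public static void main(String[] args) {
        // Same test strings as in the answer above.
        String[] tests = {"abababa", "tsttst", "acblahac", "aaaaa"};
        for (String test : tests) {
            System.out.println(test + " -> " + longestBorder(test));
        }
    }
}
```

For the four test strings this gives lengths 5, 3, 2 and 4. Each character advances `k` at most once and every fallback strictly decreases it, which is why the total work stays O(N).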

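Answer 2: (score: 0)

See:

http://algorithmsforcontests.blogspot.com/2012/08/borders-of-string.html

for an O(n) solution.

The code there actually computes the index of the last character of the prefix. To get the actual prefix/suffix, you need to extract the substring from 0 to j (both inclusive, length j + 1).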
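The extraction step might look like this in Java. The linked code is not reproduced here, so the helper name and the convention that j = -1 means "no match" are assumptions made for the example.

```java
class BorderFromIndex {
    // j is the index of the last character of the matching prefix.
    // Assumed convention for this sketch: j == -1 means there is no non-empty match.
    static String borderFromLastIndex(String s, int j) {
        // substring's end index is exclusive, so j + 1 covers indices 0..j inclusive.
        return s.substring(0, j + 1);
    }

    public static void main(String[] args) {
        String s = "abababa";
        int j = 4; // last index of the prefix "ababa"
        String border = borderFromLastIndex(s, j);
        // The extracted prefix should also be a suffix of s.
        System.out.println(border + " " + s.endsWith(border)); // prints "ababa true"
    }
}
```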