Question

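I am reading about the KMP algorithm on Wikipedia. One line in the "Description of pseudocode for the table-building algorithm" section confuses me: let cnd ← T[cnd]

It has a comment: (second case: it doesn't, but we can fall back). I understand that we can fall back, but why to T[cnd] specifically? Is there a reason? It confuses me a lot.

Here is the complete pseudocode for the table-building algorithm: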

algorithm kmp_table:
    input:
        an array of characters, W (the word to be analyzed)
        an array of integers, T (the table to be filled)
    output:
        nothing (but during operation, it populates the table)

    define variables:
        an integer, pos ← 2 (the current position we are computing in T)
        an integer, cnd ← 0 (the zero-based index in W of the next character of the current candidate substring)

    (the first few values are fixed but different from what the algorithm might suggest)
    let T[0] ← -1, T[1] ← 0

    while pos < length(W) do
        (first case: the substring continues)
        if W[pos - 1] = W[cnd] then
            let cnd ← cnd + 1, T[pos] ← cnd, pos ← pos + 1

        (second case: it doesn't, but we can fall back)
        else if cnd > 0 then
            let cnd ← T[cnd]

        (third case: we have run out of candidates.  Note cnd = 0)
        else
            let T[pos] ← 0, pos ← pos + 1

Answer 1

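You can fall back to T[cnd] because it holds the length of the longest proper prefix of W that is also a proper suffix of W[0..cnd-1], the candidate that just failed. That candidate is itself a suffix of the characters ending at W[pos-2], so any shorter candidate must be both a prefix and a suffix of it, and the longest such one has length T[cnd]. If the current character W[pos-1] then matches the character at index T[cnd] (the new value of cnd), the shorter candidate can be extended by one character, which is the first case again.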
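To watch the fallback happen, here is a minimal Python sketch that mirrors the pseudocode from the question. The function name kmp_table and the optional trace argument are my own additions for illustration, not part of the Wikipedia article.

```python
def kmp_table(w, trace=None):
    """Build the partial-match table, following the pseudocode in the question.

    If a list is passed as `trace` (a hypothetical debugging aid), every
    fallback is recorded in it as (pos, old_cnd, new_cnd).
    """
    t = [0] * len(w)
    if not w:
        return t
    t[0] = -1          # fixed first value; t[1] is already 0
    pos, cnd = 2, 0
    while pos < len(w):
        if w[pos - 1] == w[cnd]:
            # first case: the candidate continues
            cnd += 1
            t[pos] = cnd
            pos += 1
        elif cnd > 0:
            # second case: try the next shorter candidate
            if trace is not None:
                trace.append((pos, cnd, t[cnd]))
            cnd = t[cnd]
        else:
            # third case: no candidates left
            t[pos] = 0
            pos += 1
    return t


if __name__ == "__main__":
    log = []
    print(kmp_table("AABAAAB", log))
    print(log)
```

For "AABAAAB" the interesting step is pos = 6: the candidate "AA" cannot be extended because W[5] = 'A' differs from W[2] = 'B', so cnd drops from 2 to T[2] = 1, and then W[5] does match W[1], giving T[6] = 2.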

I think of it as a bit like dynamic programming: you rely on values that were computed earlier.
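One way to convince yourself that the earlier entries really contain everything you need is the sketch below. It fills the table straight from its definition by brute force, then follows the chain n, T[n], T[T[n]], ... and checks that this visits every prefix that is also a suffix, from longest to shortest. All function names here are hypothetical helpers, and the brute-force construction is only for checking, not how KMP builds the table.

```python
def all_border_lengths(s):
    """Brute force: lengths of every proper prefix of s that is also a suffix, longest first."""
    return [k for k in range(len(s) - 1, 0, -1) if s[:k] == s[-k:]]


def definition_table(w):
    """Table built directly from the definition (slow, for checking only).

    t[0] is -1 by convention; t[n] is the longest border length of w[:n].
    """
    t = []
    for n in range(len(w)):
        if n == 0:
            t.append(-1)
        else:
            borders = all_border_lengths(w[:n])
            t.append(borders[0] if borders else 0)
    return t


def border_lengths_via_table(t, n):
    """Follow t[n], t[t[n]], ... to list the border lengths of the first n characters."""
    out = []
    k = t[n]
    while k > 0:
        out.append(k)
        k = t[k]
    return out


if __name__ == "__main__":
    w = "ABABABC"
    t = definition_table(w)
    print(t)
    print(border_lengths_via_table(t, 6), all_border_lengths(w[:6]))
```

The two lists printed on the last line agree, which is why stepping from cnd to T[cnd] never skips a candidate.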

This explanation might help you.
