Longest common subsequence instead of subword

Time: 2018-07-13 18:31:20

Tags: python string longest-substring

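I am using a list and iterating from the last element of the strings to the first to find the longest common subword between them. The value for the current iteration, LC[i][j], therefore depends on the earlier iteration's LC[i+1][j+1] and on whether the current elements match.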

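I am facing two problems.
- The else statement LC[i][j] = 0 forces LC[i+1][j+1] to 0, so when there is a match in the current iteration, LC[i][j] becomes 1 (because LC[i+1][j+1] has already been overwritten with 0).
- When I remove the else statement, instead of finding the longest common subword, it gives the longest common subsequence.

Please see the output for both cases.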

case 1. when else is present. 
1st str: zxab enter 2nd str: yzab 
i = 3 j = 3 LC[i][j] = 1 LC[i+1][j+1] = 0 
i = 2 j = 2 LC[i][j] = 1 LC[i+1][j+1] = 0 
i = 0 j = 1 LC[i][j] = 1 LC[i+1][j+1] = 0 
max len = 1



case 2. else block removed.
i = 3 j = 3 LC[i][j] = 1 LC[i+1][j+1] = 0 
i = 2 j = 2 LC[i][j] = 2 LC[i+1][j+1] = 1
i = 0 j = 1 LC[i][j] = 3 LC[i+1][j+1] = 2 

max len = 3
correct answer for max_len should have been 2.



def LCW(u,v):
    m = len(u)
    n = len(v)

    LC = [[0] * (len(v) + 1)] * (len(u) + 1)  # table has one extra row and column to mark the end of each word

    max_len = 0

    for i in range(m-1,-1,-1):    # the string's last element is at index m-1
        for j in range(n-1,-1,-1):
            if u[i] == v[j]:
                LC[i][j] = 1 + LC[i+1][j+1]

            #else: LC[i][j] = 0
            if max_len < LC[i][j]:
                max_len = LC[i][j]
    return max_len

1 answer:

Answer 0 (score: 0)

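I think you are getting lost in the two-dimensional list, which is not necessary. You can simply use the in operator, which tells you whether one string can be found inside another. The only remaining problem is then generating all substrings of one input.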

def sliding_window(seq, width):
    """yields all substrings of `seq` of length `width`"""
    return (seq[idx:idx+width] for idx in range(len(seq)-width+1))

def lcw(seq1, seq2):
    """length of the longest common sequence shared by seq1 and seq2"""
    max_width = min(len(seq1), len(seq2))
    for width in range(max_width, 0, -1):
        if any(sub in seq1 for sub in sliding_window(seq2, width)):
            return width
    return 0

>>> lcw('zxab', 'yzab')
2

But it may well be that I have not understood your definition of subword, which I cannot really tell apart from a subsequence.
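
On that distinction, and only as a sketch: if "subword" means a contiguous run of characters (a substring), while a subsequence may skip characters, then the table approach from the question can also work. One likely cause of the output shown there is the table construction: `[[0] * (n + 1)] * (m + 1)` repeats the same row object, so a write to LC[i][j] shows up in every row, including LC[i+1][j]. The version below builds each row separately and keeps the else branch. The name `lcw_dp` is made up for this illustration.

```python
def lcw_dp(u, v):
    """Length of the longest common contiguous substring of u and v."""
    m, n = len(u), len(v)
    # Build each row separately. [[0] * (n + 1)] * (m + 1) would repeat
    # one row object m + 1 times, so writing LC[i][j] changes every row.
    LC = [[0] * (n + 1) for _ in range(m + 1)]
    max_len = 0
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if u[i] == v[j]:
                # extend the common run that starts at (i + 1, j + 1)
                LC[i][j] = 1 + LC[i + 1][j + 1]
            else:
                LC[i][j] = 0  # a mismatch ends the common run
            max_len = max(max_len, LC[i][j])
    return max_len
```

With separate rows, lcw_dp('zxab', 'yzab') evaluates to 2 (the shared run 'ab'), the value the question expected.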