Question

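I want to find the longest common subsequence of two given strings recursively. I wrote the code below, but it is too inefficient. Is there a way to do this in O(m * n), where m and n are the lengths of the two strings? Here is my code: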

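def lcs(x, y):
    # Base case: an empty string shares nothing with any other string.
    if len(x) == 0 or len(y) == 0:
        return ""
    # Matching first characters are part of the result.
    if x[0] == y[0]:
        return x[0] + lcs(x[1:], y[1:])
    # Otherwise drop the first character of one string and keep the longer result.
    t1 = lcs(x[1:], y)
    t2 = lcs(x, y[1:])
    if len(t1) > len(t2):
        return t1
    else:
        return t2

x = input('enter string1:')
y = input('enter string2:')
print(lcs(x, y))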

Answer 1

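You need to memoize the recursion. Without that, you end up with an exponential number of calls, because you solve the same subproblems over and over again. To make the memoized lookups more efficient, you can define the recursion in terms of the suffix lengths rather than the actual suffixes.

You can also find pseudocode for the DP on Wikipedia.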
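As a rough illustration of that idea, here is a minimal sketch that memoizes on index pairs instead of on sliced strings. The name `lcs_memo` and the inner helper `length` are made up for this example and are not from the question. It caches only the lengths, then walks them once to rebuild one answer.

```python
from functools import lru_cache


def lcs_memo(x, y):
    """Return one longest common subsequence of x and y (illustrative sketch)."""

    @lru_cache(maxsize=None)
    def length(i, j):
        # Length of the LCS of the suffixes x[i:] and y[j:].
        if i == len(x) or j == len(y):
            return 0
        if x[i] == y[j]:
            return 1 + length(i + 1, j + 1)
        return max(length(i + 1, j), length(i, j + 1))

    # Follow the cached lengths to rebuild one LCS.
    out = []
    i = j = 0
    while i < len(x) and j < len(y):
        if x[i] == y[j]:
            out.append(x[i])
            i += 1
            j += 1
        elif length(i + 1, j) >= length(i, j + 1):
            i += 1
        else:
            j += 1
    return "".join(out)


print(lcs_memo("AGGTAB", "GXTXAYB"))
```

There are at most (m + 1) * (n + 1) distinct index pairs, so the number of cached subproblems is O(m * n). The recursion can still go about m + n levels deep, so very long inputs may hit Python's recursion limit.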

Answer 2

Here is a naive non-recursive solution that uses the powerset() recipe from the itertools documentation:

from itertools import chain, combinations, product


def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))


def naive_lcs(a, b):
    return ''.join(max(set(powerset(a)) & set(powerset(b)), key=len))

It has problems:

>>> naive_lcs('ab', 'ba')
'b'
>>> naive_lcs('ba', 'ab')
'b'

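For some pairs of strings there can be more than one solution, but my program picks one arbitrarily.

Also, since any of the combinations could be the longest common one, and since computing these combinations takes O(2 ^ n) time, this solution does not run in O(n * m) time. With dynamic programming and memoization, OTOH, we can find a solution that in theory should perform better: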

from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache, so no subproblem is evicted and recomputed
def _dynamic_lcs(xs, ys):
    if not (xs and ys):
        return {''}, 0
    elif xs[-1] == ys[-1]:
        result, rlen = _dynamic_lcs(xs[:-1], ys[:-1])
        return set(each + xs[-1] for each in result), rlen + 1
    else:
        xlcs, xlen = _dynamic_lcs(xs, ys[:-1])
        ylcs, ylen = _dynamic_lcs(xs[:-1], ys)
        if xlen > ylen:
            return xlcs, xlen
        elif xlen < ylen:
            return ylcs, ylen
        else:
            return xlcs | ylcs, xlen


def dynamic_lcs(xs, ys):
    result, _ = _dynamic_lcs(xs, ys)
    return result


if __name__ == '__main__':
    seqs = list(powerset('abcde'))
    for a, b in product(seqs, repeat=2):
        assert naive_lcs(a, b) in dynamic_lcs(a, b)

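dynamic_lcs() also handles the case where a pair of strings has more than one longest common subsequence. The result is the set of these, not a single string. Finding the set of all common subsequences, though, is still of exponential complexity.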
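If a single longest common subsequence is enough, the O(m * n) bound asked about in the question can be reached with a bottom-up table. The following is a minimal sketch of the textbook tabulation, not part of the code above. The name `lcs_table` is made up for this example.

```python
def lcs_table(x, y):
    """Return one longest common subsequence of x and y (illustrative sketch)."""
    m, n = len(x), len(y)
    # dp[i][j] holds the length of the LCS of the prefixes x[:i] and y[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Trace back from the bottom-right corner to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs_table("AGGTAB", "GXTXAYB"))
```

Filling the table takes O(m * n) time and space, and the traceback adds O(m + n). When several answers of the same length exist, the tie-breaking rule in the traceback decides which one is returned.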

Thanks to Pradhan for reminding me of dynamic programming and memoization.
