Question

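How can I create a function sublist() that takes two lists, list1 and list2, and returns True if list1 is a sublist of list2, and False otherwise? list1 is a sublist of list2 if the numbers in list1 appear in list2 in the same order as they appear in list1, but not necessarily consecutively. For example,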

>>> sublist([1, 12, 3],[25, 1, 30, 12, 3, 40])
True

>>> sublist([5, 90, 2],[90, 20, 5, 2, 17])
False

Answer 1

Here is one way to do it in linear time (and constant space) using an iterator:

def sublist(a, b):
    seq = iter(b)
    try:
        for x in a:
            while next(seq) != x: pass
        else:
            return True
    except StopIteration:
        pass
    return False

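Basically, it goes through each element of the sublist and checks whether it can find that same element in the part of the complete list it has not looked at yet. If it makes it through the entire sublist, we have a match (hence the else clause on the for loop). If we run out of elements to look at in the complete list, there is no match.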

Edit: I have updated my solution so that it works with Python 3. For Python 2.5 and earlier, next(seq) needs to be replaced with seq.next().
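The same "search only the part not yet consumed" idea can also be written more compactly. The following is a minimal sketch, not part of the original answer; the name sublist_all is made up for illustration. It relies on the fact that a membership test against an iterator consumes items until it finds a match.

```python
def sublist_all(a, b):
    """Return True if the items of a appear in b in order (gaps allowed)."""
    it = iter(b)
    # `x in it` advances the iterator until it finds x (or runs out),
    # so each later search resumes just past the previous match.
    return all(x in it for x in a)


print(sublist_all([1, 12, 3], [25, 1, 30, 12, 3, 40]))  # True
print(sublist_all([5, 90, 2], [90, 20, 5, 2, 17]))      # False
```

Like the version above, this makes a single pass over b and accepts any iterable as the second argument.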

Answer 2

A very rough solution:

def sublist(a, b):
    if not a:
        return True
    for k in range(len(b)):
        if a[0] == b[k]:
            return sublist(a[1:], b[k+1:])
    return False

print(sublist([1, 12, 3], [25, 1, 30, 12, 3, 40])) # True
print(sublist([12, 1, 3], [25, 1, 30, 12, 3, 40])) # False

Edit: sped up.

Answer 3

Here is a simplified version:

def sublist(a,b):
    try:
        return a[0] in b and sublist(a[1:],b[1+b.index(a[0]):])
    except IndexError:
        return True

>>> print(sublist([1, 12, 3],[25, 1, 30, 12, 3, 40]))
True

>>> print(sublist([5, 90, 2],[90, 20, 5, 2, 17]))
False

Answer 4

Here is an iterative solution that should have optimal asymptotics:

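def sublist(x, y):
    i, lim = 0, len(y)
    for e in x:
        # Advance through y until e is found or y is exhausted.
        # Checking i < lim first avoids an IndexError when the previous
        # match was the last element of y, e.g. sublist([1, 1], [1]).
        while i < lim and y[i] != e:
            i += 1
        if i == lim:
            return False
        i += 1  # step past the matched element
    return True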

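@sshashank124's solution has the same complexity, but the dynamics will be somewhat different: his version traverses the second argument multiple times, but because it pushes more work into the C layer, it will probably be much faster on smaller inputs.

Edit: @hetman's solution has essentially the same logic, but is much more Pythonic, although, contrary to my expectation, it seems to be slightly slower. (I was also wrong about the performance of @sshashank124's solution; the overhead of the recursive calls appears to outweigh the benefit of doing more work in C.)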

Answer 5

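Congratulations on a deceptively tricky question. I think this will work, but I would not be shocked if I missed a corner case, especially with repeated elements. Revised version inspired by Huu Nguyen's recursive solution: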

def sublist(a, b):
    index_a = 0
    index_b = 0
    len_a = len(a)
    len_b = len(b)
    while index_a < len_a and index_b < len_b:
        if a[index_a] == b[index_b]:
            index_a += 1
            index_b += 1
        else:
            index_b += 1
    return index_a == len_a
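Since the worry here is corner cases, especially repeated elements, a small table of cases can help when comparing implementations. The sketch below restates the two-pointer loop so that it runs on its own; the CASES table and the failures helper are illustrative assumptions, not part of the original answer.

```python
def sublist(a, b):
    """Two-pointer subsequence check, restated from the loop above."""
    index_a = index_b = 0
    while index_a < len(a) and index_b < len(b):
        if a[index_a] == b[index_b]:
            index_a += 1
        index_b += 1
    return index_a == len(a)


# (a, b, expected) triples chosen for this sketch.
CASES = [
    ([], [], True),             # empty list is a sublist of anything
    ([], [1, 2], True),
    ([1], [], False),
    ([1, 1], [1], False),       # a repeated element needs two matches
    ([1, 1], [1, 2, 1], True),
    ([3], [1, 2, 3], True),     # match on the last element of b
    ([3, 1], [1, 2, 3], False), # right elements, wrong order
]


def failures(fn=sublist, cases=CASES):
    """Return the (a, b) pairs for which fn disagrees with the expected value."""
    return [(a, b) for a, b, expected in cases if fn(a, b) != expected]


print(failures())  # [] means every case passed
```

Passing a different implementation as fn runs it against the same table.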

Some rough profiling:

Given lists that require traversing most or all of b, my algorithm suffers:

a = [1, 3, 999999]
b = list(range(1000000))

On my computer, Huu Nguyen's or Hetman's algorithm takes about 10 seconds to complete 100 iterations of the check. My algorithm takes 20 seconds.

Given an input that succeeds early, Huu's algorithm falls far behind:

a = [1, 3, 5]

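Hetman's algorithm or mine can complete 100k checks in under a second: Hetman's finishes in 0.13 seconds on my PC, mine in 0.19 seconds. Huu's takes 16 seconds to complete 1k checks. I am frankly astounded at that degree of difference. Recursion can be slow without compiler optimization, I know, but 4 orders of magnitude is worse than I would have expected.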

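Given a list a that fails to match, performance goes back to what I saw when the whole second list had to be traversed. That is understandable, since there is no way to know that a sequence matching the otherwise unmatchable list will not turn up at the very end.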

a = [3, 1, 5]

Again, about 10 seconds for 100 tests with Huu Nguyen's or Hetman's algorithm, and 20 for mine.

Longer ordered lists keep the pattern I saw for early success. E.g.:

a = range(0, 1000, 20)

Hetman's algorithm took 10.99 seconds to complete 100k tests, while mine took 24.08. Huu's took 28.88 to complete 100 tests.

These are by no means all the tests you could run, but in every case Hetman's algorithm performed best.
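Measurements like the ones above can be reproduced with the standard timeit module. The following is a rough sketch under stated assumptions: the function names, the time_checks helper, and the repeat counts are chosen for illustration, and the numbers it prints will differ from machine to machine and from the figures quoted above.

```python
import timeit


def sublist_iter(a, b):
    """Iterator-based check (the approach credited to Hetman above)."""
    seq = iter(b)
    try:
        for x in a:
            while next(seq) != x:
                pass
    except StopIteration:
        return False
    return True


def sublist_two_pointer(a, b):
    """Index-based check from this answer."""
    index_a = index_b = 0
    while index_a < len(a) and index_b < len(b):
        if a[index_a] == b[index_b]:
            index_a += 1
        index_b += 1
    return index_a == len(a)


def time_checks(fn, a, b, number):
    """Total seconds taken by `number` calls of fn(a, b)."""
    return timeit.timeit(lambda: fn(a, b), number=number)


if __name__ == "__main__":
    b = list(range(1000000))
    scenarios = [
        ("late match", [1, 3, 999999], 5),
        ("early match", [1, 3, 5], 10000),
        ("no match", [3, 1, 5], 5),
    ]
    for label, a, number in scenarios:
        for fn in (sublist_iter, sublist_two_pointer):
            seconds = time_checks(fn, a, b, number)
            print(label, fn.__name__, number, "calls:", round(seconds, 3), "s")
```

The repeat counts are kept small for the scenarios that scan all of b so that the script finishes in a few seconds; raise them for more stable readings.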

Answer 6

How about this: let's approach it from the other side:

def sublist(a,b):
    """returns True if a is contained in b and in the same order"""
    return a == [ch for ch in b if ch in a]

There are cases where this fails (for example, [1,2,3] should count as a sublist of [1,1,8,2,3]), but it is hard to say exactly how you want this to behave...
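To make the failure mode concrete, here is a short sketch; the name sublist_filter is used only to keep it apart from the other answers. Filtering b keeps every occurrence of every element of a, so any duplicate or out-of-order extra copy in b breaks the equality test.

```python
def sublist_filter(a, b):
    """The filtering approach from this answer."""
    return a == [ch for ch in b if ch in a]


# The duplicate 1 in b survives the filter, so the lists differ.
print([ch for ch in [1, 1, 8, 2, 3] if ch in [1, 2, 3]])  # [1, 1, 2, 3]
print(sublist_filter([1, 2, 3], [1, 1, 8, 2, 3]))          # False

# An earlier stray 2 has the same effect, although 1, 2 does appear in order.
print(sublist_filter([1, 2], [2, 1, 2]))                   # False

# With no repeated elements of a in b, the approach works.
print(sublist_filter([1, 12, 3], [25, 1, 30, 12, 3, 40]))  # True
```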

Answer 7

Here is a solution that is quick to write but slow to run; for arrays of the size you have shown, it will be perfectly adequate:

def sublist(a,b):
    last = 0
    for el_a in a:
        if el_a in b[last:]:
            # index() is relative to the slice, so add it to last
            # and step past the matched element
            last += b[last:].index(el_a) + 1
        else:
            return False
    return True

Edit: now works for non-contiguous elements.

Answer 8

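Here is another solution that novices may find easier to understand than Hetman's. (Note that it is very close to the OP's implementation in this duplicate question, but avoids the problem of restarting the search from the beginning of b each time.)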

def sublist(a, b):
    i = -1
    try:
        for e in a:
            i = b.index(e, i+1)
    except ValueError:
        return False
    else:
        return True

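Of course, this requires b to be a list, whereas Hetman's answer allows any iterable. And I think that (for people who understand Python well enough) it is also less simple than Hetman's answer.

Algorithmically, it does the same thing as Hetman's answer, so it is O(N) time and O(1) space. In practice, though, it may be faster, at least in CPython, because we are moving the inner loop from a Python while around an iterator to a C fast-getindex loop (inside list.index). Then again, it may be slower, because we are copying the i value around instead of keeping all the state embedded in a (C-implemented) iterator. If it matters, test both of them with your real data. :)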
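The list-versus-iterable difference can be seen directly. In this sketch the two approaches are restated under the made-up names sublist_index and sublist_iter so that the block runs on its own. Strictly speaking, any sequence with an index(value, start) method, such as a tuple, also works with the index-based version; a generator does not.

```python
def sublist_index(a, b):
    """index()-based version from this answer; b must provide .index()."""
    i = -1
    try:
        for e in a:
            i = b.index(e, i + 1)
    except ValueError:
        return False
    return True


def sublist_iter(a, b):
    """Iterator-based version; b may be any iterable."""
    seq = iter(b)
    try:
        for x in a:
            while next(seq) != x:
                pass
    except StopIteration:
        return False
    return True


evens = (n for n in range(0, 20, 2))  # a generator, not a list
print(sublist_iter([2, 8, 14], evens))  # True

try:
    sublist_index([2, 8, 14], (n for n in range(0, 20, 2)))
except AttributeError as exc:
    print("generators have no index():", exc)
```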

Answer 9

Here is a better solution using regex:

import re


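def exiSt(short,long):
    # Wrap the text in commas so every element is delimited on both sides.
    text = ',' + ','.join([str(x) for x in long]) + ','
    # Each element must appear as a whole comma-delimited token, in order;
    # the lookahead leaves the trailing comma for the next token to use,
    # so 1 cannot match inside 12.
    r = ''.join([".*," + re.escape(str(x)) + "(?=,)" for x in short])
    return re.match(r, text) is not None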

long = [1, 2, 3, 4, 5]
short1 = [1,2,5]
short2 = [1,5,3]

exiSt(short1,long)
>> True

exiSt(short2,long)
>> False

Answer 10

def sublist(a, b):
    # If set(a) is not a subset of set(b), the answer is obviously False.
    if not set(a).issubset(set(b)):
        return False
    n = 0
    for i in a:
        if i in b[n:]:
            # The +1 is needed to skip past the matched element, so duplicates
            # in a need separate matches in b, i.e. sublist([2,1,1],[2,1]) = False.
            # If the +1 is absent, then sublist([2,1,1],[2,1]) = True.
            # Choose whether to use the +1 according to your needs.
            n += b[n:].index(i) + 1
        else:
            return False
    return True
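The effect of the +1 can be checked by making the offset a parameter. This is a sketch for illustration only: the name sublist_step and the step parameter are hypothetical, and the set pre-check is left out for brevity.

```python
def sublist_step(a, b, step):
    """Same loop as above with the offset exposed; step is hypothetical."""
    n = 0
    for i in a:
        if i in b[n:]:
            # step=1 moves past the matched element, step=0 stays on it,
            # which lets one element of b satisfy repeated elements of a.
            n += b[n:].index(i) + step
        else:
            return False
    return True


print(sublist_step([2, 1, 1], [2, 1], step=1))  # False
print(sublist_step([2, 1, 1], [2, 1], step=0))  # True
```

With step=0 the order is still enforced; only repeated elements are treated differently.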

Determine whether all elements of a list are present, and in the same order, in another list
