Question

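Hello, I have searched here but could not find an answer to my question.

I am using Python and have two lists. Both are sorted. The first list is usually the longer one (around 10,000 elements) and it never changes. The second is shorter but grows as the program runs, eventually reaching the same length.

The lists might look like this: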

[1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
[1, 1, 2, 2, 3, 4, 16, 18, 19, 20]

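In this case I want to return 13, because it is the largest element of list 1 that is not in list 2.

I do this repeatedly, so list 1 has to stay unchanged. Both lists contain duplicate values.

My naive approach is too slow: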

def removeItems(list2, list1):
    list1Copy = list(list1)
    for item in list2:
        if item in list1Copy:
            list1Copy.remove(item)

    return list1Copy

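So I just create a new list, remove every item that is present in the shorter list, and the value I want is then the last value in list1Copy.

Surely there is a faster way, using dicts or something similar?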

Answer 1

>>> l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
>>> l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]

You can get a list of all the items in l1 that do not appear in l2

>>> list(filter(lambda i: i not in l2, l1))
[5, 5, 6, 7, 8, 8, 10, 11, 12, 13]

and then take the max of that list

>>> max(filter(lambda i : i not in l2, l1))
13
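
The same idea can be written so that each membership test does not rescan l2. This is a minimal sketch, assuming only membership matters and duplicate counts are ignored; the names l2_values and missing are illustrative and not part of the answer.

```python
l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]

# Build the set once so that "not in" is a constant-time lookup
# instead of a linear scan of l2 for every item of l1.
l2_values = set(l2)

# Keep the items of l1 whose value never occurs in l2.
missing = [i for i in l1 if i not in l2_values]
print(missing)

# default=None avoids a ValueError when nothing is missing.
print(max(missing, default=None))
```

Note that this treats l1 = [19, 20, 20] and l2 = [19, 20] as having nothing missing, because the extra 20 is only a difference in count.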

Answer 2

>>> l1 = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20]
>>> l2 = [1, 1, 2, 2, 3, 4, 16, 18, 19, 20]
>>> max(set(l1) - set(l2))
13

Edit, to handle duplicates:

>>> l1 = [19, 20, 20]
>>> l2 = [19, 20]
>>> from collections import Counter
>>> max(Counter(l1) - Counter(l2))
20
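
The Counter version works because subtracting two Counter objects keeps only the values whose count stays positive, so a value that occurs more often in l1 than in l2 survives. A minimal sketch of that behaviour, wrapped in a hypothetical helper named largest_unmatched that returns None when nothing is left over (an assumption, the answer itself would raise in that case):

```python
from collections import Counter


def largest_unmatched(l1, l2):
    """Largest value that occurs more often in l1 than in l2, or None."""
    # Counter subtraction drops every value whose count falls to zero or below.
    remaining = Counter(l1) - Counter(l2)
    # Iterating a Counter yields its keys, so max() picks the largest value.
    return max(remaining) if remaining else None


print(largest_unmatched([19, 20, 20], [19, 20]))
print(largest_unmatched(
    [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 10, 11, 12, 13, 16, 18, 19, 20],
    [1, 1, 2, 2, 3, 4, 16, 18, 19, 20],
))
```

Both Counter objects are built from the full lists, so the cost is linear in the combined length on every call.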

Answer 3

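None of the answers given so far takes advantage of the fact that the lists are sorted and that we want the largest value of l1 that is not in l2. Here is a solution that does: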

from itertools import zip_longest # note this function is named izip_longest in Python 2

def max_in_l1_not_in_l2(l1, l2):
    if len(l1) <= len(l2):
        raise ValueError("l2 has at least as many items as l1")
    for a, b in zip_longest(reversed(l1), reversed(l2), fillvalue=float("-inf")):
        if a > b:
            return a
        elif a != b:
            raise ValueError("l2 has larger items than l1")
    raise ValueError("There is no value in l1 that is not in l2") # should never get here

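If you can rely on l2 being a proper subset of l1, you can remove the error checking. If you boil it down, you end up with a very simple loop, which can even become a single expression: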

next(a for a, b in zip_longest(reversed(l1), reversed(l2), fillvalue=float("-inf"))
       if a > b)

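The reason this code will usually be faster than other implementations (such as behzad.nouri's good answer using collections.Counter) is that, thanks to the reverse iteration, it can return its result immediately when it comes across a value from l1 that is not in l2 (the first such value it finds will be the largest). Doing a multiset subtraction will always process all the values of both lists, even though we may only need to look at the largest few values.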

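My code should be noticeably faster than any non-short-circuiting version in cases like this one: a long l1 and an l2 that lacks only one of l1's largest values, so the answer is found after looking at just a few items.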
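
The short-circuiting can be made visible by counting how many pairs the loop examines. The sketch below is an illustration under stated assumptions, not the answer's own example: it assumes l2 is l1 with some items removed, drops the error checks, returns an extra step count purely for demonstration, and uses made-up data (one million consecutive integers, with the last one missing from l2).

```python
from itertools import zip_longest


def max_in_l1_not_in_l2(l1, l2):
    """Return (answer, pairs_examined), scanning both sorted lists from the end."""
    pairs = zip_longest(reversed(l1), reversed(l2), fillvalue=float("-inf"))
    for steps, (a, b) in enumerate(pairs, start=1):
        if a > b:
            # First mismatch from the end: a is the largest unmatched value.
            return a, steps
    raise ValueError("no value in l1 is missing from l2")


# Hypothetical data: l2 lacks only the largest item of l1.
l1 = list(range(1_000_000))
l2 = l1[:-1]

answer, steps = max_in_l1_not_in_l2(l1, l2)
print(answer, steps)
```

Here the loop stops at the very first pair, while a version that builds two Counter objects would still visit all of the roughly two million items.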

Answer 4

OK, so I got it working:

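def findLargestUnknownLength(l1, l2):
    # Walk both sorted lists from the end, pairing off equal values.
    l1Index = len(l1) - 1
    l2Index = len(l2) - 1

    while l1Index >= 0:
        # l2 is used up, or its current value is smaller: l1's value is unmatched.
        if l2Index < 0 or l2[l2Index] < l1[l1Index]:
            return l1[l1Index]
        elif l2[l2Index] == l1[l1Index]:
            l1Index -= 1
            l2Index -= 1
        else:
            # l2 holds a value that is not in l1; skip it instead of looping forever.
            l2Index -= 1

    return None  # every item of l1 was matched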

For anyone wondering, this is part of a solution to the Turnpike problem. There is a good description here: Turnpike Walkthrough.

It is a problem from Rosalind.
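
A two-pointer walk like the one in this answer is easy to get subtly wrong at the ends of the lists, so it can be worth checking against a slower reference. The following is a minimal sketch: largest_unmatched and reference are hypothetical names, the random inputs are made up, and returning None when every item is matched is an assumption rather than something the answer specifies.

```python
import random
from collections import Counter


def largest_unmatched(l1, l2):
    """Two-pointer walk from the end of two sorted lists."""
    i, j = len(l1) - 1, len(l2) - 1
    while i >= 0:
        # l2 exhausted, or its value is smaller: l1[i] has no partner.
        if j < 0 or l2[j] < l1[i]:
            return l1[i]
        if l2[j] == l1[i]:
            i -= 1  # matched pair, advance in l1 as well
        j -= 1      # advance in l2 (also skips values absent from l1)
    return None


def reference(l1, l2):
    """Slow but simple check based on multiset subtraction."""
    remaining = Counter(l1) - Counter(l2)
    return max(remaining) if remaining else None


random.seed(0)
for _ in range(1000):
    l1 = sorted(random.choices(range(10), k=random.randint(1, 12)))
    # Sampling positions of l1 guarantees l2 is l1 with some items removed.
    l2 = sorted(random.sample(l1, k=random.randint(0, len(l1))))
    assert largest_unmatched(l1, l2) == reference(l1, l2)
```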

Find the largest element in one of two sorted lists that does not appear in the other
