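I'm currently taking an introductory computer science course, and I'd like to know how to check whether there are any duplicate items across multiple lists. I have already read these answers:
How can I compare two lists in python and return matches and How to find common elements in list of lists?
However, they are not quite what I'm looking for. Say I have this list of lists: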
list_x = [[66,76],
[25,26,27],
[65,66,67,68],
[40,41,42,43,44],
[11,21,31,41,51,61]]
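There are two duplicated values (66 and 41), although which ones they are doesn't matter to me. Is there a way to find out whether any duplicates exist? What I'm looking for is a function that returns True if there are duplicates (or False, depending on what I want to do with the lists). I get the impression that I should use sets (we haven't covered them, so I looked them up online), use a loop, or write my own function. If I need to write my own function, please let me know and I'll edit this later today!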
Answer 0 (score: 3)
A very simple solution is to first flatten the list with a list comprehension, then use set and len together to test for any duplicates:
>>> list_x = [[66,76],
... [25,26,27],
... [65,66,67,68],
... [40,41,42,43,44],
... [11,21,31,41,51,61]]
>>> flat = [y for x in list_x for y in x]
>>> flat # Just to demonstrate
[66, 76, 25, 26, 27, 65, 66, 67, 68, 40, 41, 42, 43, 44, 11, 21, 31, 41, 51, 61]
>>> len(flat) != len(set(flat)) # True because there are duplicates
True
>>>
>>> # This list has no duplicates...
... list_x = [[1, 2],
... [3, 4, 5],
... [6, 7, 8, 9],
... [10, 11, 12, 13],
... [14, 15, 16, 17, 18]]
>>> flat = [y for x in list_x for y in x]
>>> len(flat) != len(set(flat)) # ...so this is False
False
>>>
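Since the question asks for something that returns True or False, here is a minimal sketch of the same check wrapped in a function. The name has_duplicates is hypothetical, and the sketch assumes every item is hashable (as integers are):

```python
def has_duplicates(list_of_lists):
    """Return True if any item occurs more than once across all sublists."""
    # Flatten the nested list into a single list of items.
    flat = [item for sublist in list_of_lists for item in sublist]
    # A set drops repeats, so a shorter set means something was duplicated.
    return len(flat) != len(set(flat))


list_x = [[66, 76],
          [25, 26, 27],
          [65, 66, 67, 68],
          [40, 41, 42, 43, 44],
          [11, 21, 31, 41, 51, 61]]

print(has_duplicates(list_x))            # True (66 and 41 repeat)
print(has_duplicates([[1, 2], [3, 4]]))  # False
```

Note that this also counts a value repeated inside a single sublist as a duplicate.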
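Note, however, that this approach will be somewhat slow if list_x is large, because it builds the full flattened list and a full set before comparing anything. If performance is a concern, you can use a lazy approach that combines a generator expression, any, and set.add, and stops at the first duplicate it finds: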
>>> list_x = [[66,76],
... [25,26,27],
... [65,66,67,68],
... [40,41,42,43,44],
... [11,21,31,41,51,61]]
>>> seen = set()
>>> any(y in seen or seen.add(y) for x in list_x for y in x)
True
>>>
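If the one-liner is hard to read, the same lazy idea can be spelled out as an explicit loop. This is only a sketch: has_duplicates_lazy and tracked are hypothetical names, and tracked exists purely to show how many items get examined before the function returns.

```python
def has_duplicates_lazy(list_of_lists):
    """Return True as soon as an already-seen item turns up again."""
    seen = set()
    for sublist in list_of_lists:
        for item in sublist:
            if item in seen:
                return True   # stop immediately; later items are never examined
            seen.add(item)
    return False


consumed = []

def tracked(items):
    """Yield items one by one, recording each one that is actually requested."""
    for item in items:
        consumed.append(item)
        yield item


print(has_duplicates_lazy([tracked([1, 2, 1, 3, 4])]))  # True
print(consumed)                                         # [1, 2, 1]
```

The trailing 3 and 4 are never looked at, which is where the saving comes from on large inputs that contain an early duplicate.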
Answer 1 (score: 1)
Iterate and use a set to detect whether there are duplicates:
seen = set()
dupes = [i for lst in list_x for i in lst if i in seen or seen.add(i)]
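This takes advantage of the fact that seen.add() returns None. A set is an unordered collection of unique values; i in seen tests True if i is already part of the set. When i is new, the or falls through to seen.add(i), which records the value and returns None; None is falsy, so the item is filtered out and only repeated values end up in the result.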
Demo:
>>> list_x = [[66,76],
... [25,26,27],
... [65,66,67,68],
... [40,41,42,43,44],
... [11,21,31,41,51,61]]
>>> seen = set()
>>> [i for lst in list_x for i in lst if i in seen or seen.add(i)]
[66, 41]
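To make the trick a little more concrete, the following sketch shows set.add returning None and what the comprehension produces when a value occurs more than twice. The data list is a made-up example, not from the question:

```python
seen = set()
print(seen.add(1))   # None: set.add mutates the set and returns nothing
print(1 in seen)     # True

# Hypothetical input in which 2 appears three times.
data = [[1, 2], [2, 3], [2, 4]]
seen = set()
dupes = [i for lst in data for i in lst if i in seen or seen.add(i)]

print(dupes)               # [2, 2]: one entry per repeat after the first occurrence
print(sorted(set(dupes)))  # [2]: wrap in set() for the distinct duplicated values
print(bool(dupes))         # True: a non-empty list means duplicates exist
```

For the plain True/False answer the question asks for, bool(dupes) is enough.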
Answer 2 (score: 0)
Here is a more straightforward solution using sets:
list_x = [[66,76],
[25,26,27],
[65,66,67,68],
[40,41,42,43,44],
[11,21,31,41,51,61]]
seen = set()
duplicated = set()
for lst in list_x:
    numbers = set(lst)  # unique values in this sublist only
    # intersect with the values seen so far and add the overlap to duplicated:
    duplicated |= numbers & seen
    # add this sublist's values to seen
    seen |= numbers
print(duplicated)
For information about set and its operations, see the documentation: https://docs.python.org/2/library/stdtypes.html#set
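As a rough sketch, the same set-intersection idea can be packaged as a function. The name cross_list_duplicates is hypothetical. One behaviour worth knowing: because each sublist is turned into a set first, a value repeated only inside a single sublist is not reported.

```python
def cross_list_duplicates(list_of_lists):
    """Return the set of values that appear in more than one sublist."""
    seen = set()
    duplicated = set()
    for lst in list_of_lists:
        numbers = set(lst)            # unique values of this sublist
        duplicated |= numbers & seen  # values already met in an earlier sublist
        seen |= numbers               # remember this sublist's values
    return duplicated


list_x = [[66, 76],
          [25, 26, 27],
          [65, 66, 67, 68],
          [40, 41, 42, 43, 44],
          [11, 21, 31, 41, 51, 61]]

print(sorted(cross_list_duplicates(list_x)))  # [41, 66]
print(cross_list_duplicates([[1, 1], [2]]))   # set(): the repeat is within one sublist
```

Calling bool() on the result gives the True/False answer from the question.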