Question

Possible Duplicate:
  How do you remove duplicates from a list in Python whilst preserving order?
  In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?

I was wondering if there is a function which does the following:

Take a list as an argument:

list = [ 3 , 5 , 6 , 4 , 6 , 2 , 7 , 6 , 5 , 3 ]

and delete all the repetitions in the list to obtain:

list = [ 3 , 5 , 6 , 4 , 2 , 7 ]

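I know you can convert it into a dictionary and rely on the fact that dictionaries cannot have duplicate keys, but I was wondering if there is a better way of doing it.

Thanks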

Answer 1

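See the Python documentation for three ways to accomplish this. The following is copied from that site. Replace the example 'mylist' with your variable name ('list').

First example: if you don't mind reordering the list, sort it and then scan from the end of the list, deleting duplicates as you go: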

if mylist:
    mylist.sort()
    last = mylist[-1]
    for i in range(len(mylist)-2, -1, -1):
        if last == mylist[i]:
            del mylist[i]
        else:
            last = mylist[i]

Second example: if all elements of the list can be used as dictionary keys (i.e. they are all hashable), this is often faster:

d = {}
for x in mylist:
    d[x] = 1
mylist = list(d.keys())

Third example: in Python 2.5 and later:

mylist = list(set(mylist))
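
As a small illustration of the trade-off between these approaches, the sketch below (the sample data and the names `unique_unordered` and `unique_sorted` are mine, not from the documentation) shows that the set-based version returns the right elements but makes no promise about their order. Sorting the result is one way to get a predictable list when the original order does not matter.

```python
mylist = [3, 5, 6, 4, 6, 2, 7, 6, 5, 3]

# set() drops duplicates, but the iteration order of a set is an
# implementation detail and should not be relied upon.
unique_unordered = list(set(mylist))

# Sorting gives a predictable result when the original order is irrelevant.
unique_sorted = sorted(set(mylist))
print(unique_sorted)  # [2, 3, 4, 5, 6, 7]
```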

Answer 2

Even though you said you don't necessarily want to use a dict, I think an OrderedDict is a clean solution here.

from collections import OrderedDict

l = [3 ,5 ,6 ,4 ,6 ,2 ,7 ,6 ,5 ,3]
list(OrderedDict.fromkeys(l))
# [3, 5, 6, 4, 2, 7]

Note that this preserves the original order.
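
On Python 3.7 and later, plain dicts are also guaranteed to keep insertion order, so a similar one-liner should work without the import. A minimal sketch, assuming Python 3.7+ (the name `deduped` is only for illustration):

```python
l = [3, 5, 6, 4, 6, 2, 7, 6, 5, 3]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
deduped = list(dict.fromkeys(l))
print(deduped)  # [3, 5, 6, 4, 2, 7]
```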

Answer 3

list(set(list)) works well.

Answer 4

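First of all, don't name it list, as that shadows the built-in type list. Say, my_list

To solve your problem, the way I have seen most often is list(set(my_list))

set is an unordered container that holds only unique elements, and gives (I think) O(1) insertion and membership checking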
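
To make the shadowing point concrete, here is a small sketch. The function `shadowing_demo` is hypothetical and only exists to confine the rebinding of `list` to a local scope, so the rest of the example still sees the built-in.

```python
def shadowing_demo():
    # Inside this function, 'list' is rebound to a list object,
    # hiding the built-in type for the rest of the function body.
    list = [3, 5, 6, 4, 6, 2, 7, 6, 5, 3]
    try:
        return list(set(list))
    except TypeError as exc:
        return "TypeError: %s" % exc

my_list = [3, 5, 6, 4, 6, 2, 7, 6, 5, 3]
unique_items = list(set(my_list))  # works: 'list' is still the built-in here

print(shadowing_demo())      # TypeError: 'list' object is not callable
print(sorted(unique_items))  # [2, 3, 4, 5, 6, 7]
```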

Answer 5

list(set(l)) will not preserve the order. If you want to preserve the order, do:

s = set()
result = []
for item in l:
    if item not in s:
        s.add(item)
        result.append(item)

print(result)

This runs in O(n), where n is the length of the original list.
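
The O(n) bound relies on set lookups and insertions taking constant time on average, which in turn requires the items to be hashable. As a rough sketch, the same loop can be wrapped in a function so the input list is left untouched (the name `dedupe` is made up for this example):

```python
def dedupe(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # average O(1) lookup in a set
            seen.add(item)
            result.append(item)
    return result

l = [3, 5, 6, 4, 6, 2, 7, 6, 5, 3]
print(dedupe(l))  # [3, 5, 6, 4, 2, 7]
```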

Answer 6

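At the time of writing this answer, the only solutions that preserve order are the OrderedDict solution and Dave's slightly more verbose solution.

Here's another way, where we abuse side effects while iterating, which is also more verbose than the OrderedDict solution: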

def uniques(iterable):
    seen = set()
    sideeffect = lambda _: True
    return [x for x in iterable
            if (x not in seen) and sideeffect(seen.add(x))]
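
The trick above works because `set.add` mutates the set and returns `None`, so its return value has to be wrapped before it can be used in a condition. A short sketch of the two pieces in isolation (the values 3 and 5 are arbitrary):

```python
seen = set()

# set.add mutates the set in place and returns None, which is falsy.
print(seen.add(3))  # None

# Wrapping the call in a function that ignores its argument and returns True
# lets it sit in a boolean condition without filtering anything out.
sideeffect = lambda _: True
print(sideeffect(seen.add(5)))  # True

print(sorted(seen))  # [3, 5]
```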

Answer 7

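A set would be a better approach than a dictionary in terms of big-O complexity. But both approaches make you lose the ordering (unless you use an ordered dictionary, which increases the complexity again).

As other posters have already said, the set solution is not that hard: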

l = [ 3 , 5 , 6 , 4 , 6 , 2 , 7 , 6 , 5 , 3 ]
list(set(l))

A way to keep the ordering is:

def uniques(l):
    seen = set()

    for i in l:
        if i not in seen:
            seen.add(i)
            yield i

Or, in a less readable way:

def uniques(l):
    seen = set()
    return (seen.add(i) or i for i in l if i not in seen)

You can then use it like this:

l = [ 3 , 5 , 6 , 4 , 6 , 2 , 7 , 6 , 5 , 3 ]
list(uniques(l))
>>> [3, 5, 6, 4, 2, 7]
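
One consequence of writing `uniques` as a generator is that it is lazy, so it can be combined with `itertools.islice` to take the first few unique values from a stream that never ends. A sketch under that assumption, with a made-up endless input (`stream`) built from `itertools.count`:

```python
from itertools import count, islice

def uniques(l):
    seen = set()
    for i in l:
        if i not in seen:
            seen.add(i)
            yield i

# An endless stream in which every value appears twice: 0, 0, 1, 1, 2, 2, ...
stream = (n // 2 for n in count())

# Only as much of the stream is consumed as is needed for five unique values.
first_five = list(islice(uniques(stream), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Note that the set of seen values keeps growing for as long as new values arrive, so memory use is proportional to the number of distinct items consumed.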

Answer 8

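This is a snippet from my own collection of handy Python tools. It uses the "abusing side-effects" approach that ninjagecko has in his answer. It also takes care to handle unhashable values, and to return a sequence of the same type as the one passed in: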

def unique(seq, keepstr=True):
    """Function to keep only the unique values supplied in a given 
       sequence, preserving original order."""

    # determine what type of return sequence to construct
    if isinstance(seq, (list,tuple)):
        returnType = type(seq)
    elif isinstance(seq, basestring):
        returnType = (list, type(seq)('').join)[bool(keepstr)] 
    else:
        # - generators and their ilk should just return a list
        returnType = list

    try:
        seen = set()
        return returnType(item for item in seq if not (item in seen or seen.add(item)))
    except TypeError:
        # sequence items are not of a hashable type, can't use a set for uniqueness
        seen = []
        return returnType(item for item in seq if not (item in seen or seen.append(item)))

Here are a variety of calls, with sequences/iterators/generators of various types:

from itertools import chain
print unique("ABC")
print unique(list("ABABBAC"))
print unique(range(10))
print unique(chain(reversed(range(5)), range(7)))
print unique(chain(reversed(xrange(5)), xrange(7)))
print unique(i for i in chain(reversed(xrange(5)), xrange(7)) if i % 2)

Prints:

ABC
['A', 'B', 'C']
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[4, 3, 2, 1, 0, 5, 6]
[4, 3, 2, 1, 0, 5, 6]
[3, 1, 5]
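
The snippet above targets Python 2 (`basestring`, `xrange`, `print` statements). A possible Python 3 adaptation is sketched below; it is my own port, not the answer author's code, and it assumes that `str` is an acceptable stand-in for `basestring` and that string subclasses may come back as plain `str`.

```python
from itertools import chain

def unique(seq, keepstr=True):
    """Keep only the first occurrence of each value, preserving order."""
    # Decide what kind of sequence to return.
    if isinstance(seq, (list, tuple)):
        return_type = type(seq)
    elif isinstance(seq, str):
        # Python 3 has no basestring; str covers text strings.
        return_type = "".join if keepstr else list
    else:
        # Generators and other iterables just produce a list.
        return_type = list

    try:
        seen = set()
        return return_type(item for item in seq
                           if not (item in seen or seen.add(item)))
    except TypeError:
        # Items are not hashable, so fall back to a (slower) list of seen items.
        seen = []
        return return_type(item for item in seq
                           if not (item in seen or seen.append(item)))

print(unique("ABC"))                                 # ABC
print(unique(list("ABABBAC")))                       # ['A', 'B', 'C']
print(unique(chain(reversed(range(5)), range(7))))   # [4, 3, 2, 1, 0, 5, 6]
print(unique([[1], [2], [1]]))                       # [[1], [2]]
```

As in the original, the unhashable fallback iterates over `seq` again, so it is only reliable for inputs that can be iterated more than once, such as lists and tuples.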

Removing duplicates in lists (Python)

8 answers: