Remove earlier duplicates from a list in Python and keep only the last occurrence of each item

Time: 2020-04-20 19:49:54

Tags: python

I am trying to use this function:

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

So I test it with

f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]) 

When I run this, the output is

[5, 9, 6, 8, 7]

This keeps only the first occurrence of each value. But I need to keep only the last occurrence.

So the output should be

[5, 7, 8, 6, 9]

6 answers:

Answer 0: (score: 2)

This might work:

In [1847]: def f7(seq):
      ...:     seen = set()
      ...:     seen_add = seen.add
      ...:     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
      ...:

In [1848]: f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
Out[1848]: [5, 7, 8, 6, 9]

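Answer 1: (score: 0)

You can create a set to collect the unique items, then sort them by their index in the reversed sequence (which finds the last occurrence of each element), and pass reverse=True to get back the original order. Note that seq[::-1].index(i) is evaluated once per unique item, so this is quadratic in the worst case.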

def f7(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

>>> f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
[5, 7, 8, 6, 9]
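
The same "sort by last position" idea can be sketched without repeated reversing and searching. This is only an illustration, not part of the original answer; the name keep_last_by_index is made up for the example. It records the last index of each value in one pass and sorts the unique values by that index:

```python
def keep_last_by_index(seq):
    # Hypothetical helper name, chosen for this sketch.
    # Later occurrences overwrite earlier ones, so each value ends up
    # mapped to the index of its last occurrence.
    last_index = {x: i for i, x in enumerate(seq)}
    # Sorting the unique values by that index restores their relative order.
    return sorted(last_index, key=last_index.get)


print(keep_last_by_index([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```

Because the ordering comes from the explicit sort, this does not depend on dictionaries preserving insertion order.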

Answer 2: (score: 0)

Do it in reverse. Here is a rough implementation:

l = [5, 5, 9, 6, 8, 7, 7, 8, 6, 9]
def f7(seq):
    seen = set()
    seen_add = seen.add
    return list(reversed([x for x in reversed(seq) if not (x in seen or seen_add(x))]))

print(f7(l))

Output:

[5, 7, 8, 6, 9]

If you want it to be more efficient, you could use a descending for loop and/or collections.deque.
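
As a rough sketch of the descending for loop mentioned here (the name keep_last_loop is an assumption for illustration, and whether it is actually faster would need measuring), one could walk the indices from the end and reverse the collected result once:

```python
def keep_last_loop(seq):
    # Hypothetical helper name, chosen for this sketch.
    seen = set()
    result = []
    # Walk the indices from the end to the start, so the first time a value
    # is met corresponds to its last occurrence in the original order.
    for i in range(len(seq) - 1, -1, -1):
        x = seq[i]
        if x not in seen:
            seen.add(x)
            result.append(x)
    # The values were collected back to front; flip them once, in place.
    result.reverse()
    return result


print(keep_last_loop([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```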

Answer 3: (score: 0)

You can reverse the list first, and then reverse the result.

>>> def f7(seq):
...     seen = set()
...     seen_add = seen.add
...     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
...
>>> print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]) )
[5, 7, 8, 6, 9]

Answer 4: (score: 0)

You can use dict.fromkeys

def f7(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]

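This works if your Python version is >= 3.6, because it relies on dictionaries preserving insertion order (an implementation detail in CPython 3.6, guaranteed by the language from 3.7 onward).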
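
For interpreters where plain dictionaries do not keep insertion order, a possible variant (a sketch only, with keep_last_ordered as a made-up name) is to use collections.OrderedDict, which keeps insertion order regardless of version:

```python
from collections import OrderedDict


def keep_last_ordered(seq):
    # Hypothetical helper name, chosen for this sketch.
    # OrderedDict.fromkeys keeps the first time each key is seen; feeding it
    # the reversed sequence therefore keeps the last occurrences.
    unique_reversed = list(OrderedDict.fromkeys(reversed(seq)))
    # Reverse once more to restore the original relative order.
    return unique_reversed[::-1]


print(keep_last_ordered([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```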


Here is a simple benchmark of the proposed solutions:


from simple_benchmark import BenchmarkBuilder
import random
from collections import deque


b = BenchmarkBuilder()

@b.add_function()
def MayankPorwal(seq):
    seen = set() 
    seen_add = seen.add 
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1] 

@b.add_function()
def CoryKramer(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

@b.add_function()
def kederrac(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

@b.add_function()
def LeKhan9(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)

@b.add_arguments('List length')
def argument_provider():
    for exp in range(2, 14):
        size = 2**exp
        yield size, [random.randint(0, size) for _ in range(size)]

r = b.run()
r.plot()
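
If the third-party simple_benchmark package is not available, a rough comparison can be sketched with the standard library's timeit instead. The function names, the input size of 1024 and the fixed seed below are assumptions made for this example, and the timings will vary between machines, so none are shown:

```python
import random
import timeit


def with_seen_set(seq):
    # Same approach as the set-based answers: reverse, dedupe, reverse back.
    seen = set()
    seen_add = seen.add
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]


def with_dict(seq):
    # Same approach as the dict.fromkeys answer (needs ordered dicts).
    return list(dict.fromkeys(seq[::-1]))[::-1]


random.seed(0)  # fixed seed so repeated runs use the same input
data = [random.randint(0, 1024) for _ in range(1024)]

# Both functions should agree before their timings are compared.
assert with_seen_set(data) == with_dict(data)

for func in (with_seen_set, with_dict):
    elapsed = timeit.timeit(lambda: func(data), number=1000)
    print(f"{func.__name__}: {elapsed:.4f} s for 1000 calls")
```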

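Answer 5: (score: 0)

To avoid reversing twice, you can use a double-ended queue (collections.deque). Each appendleft here should run in constant time.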

from collections import deque

def f7(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)


print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))

Output: [5, 7, 8, 6, 9]
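
Since reversed() needs a sequence, the deque version does not accept a generator. A single forward pass is another way to avoid reversing at all. The following is only a sketch under the assumption of Python 3.7+ (ordered dictionaries), and keep_last_forward is a made-up name:

```python
def keep_last_forward(iterable):
    # Hypothetical helper name, chosen for this sketch.
    # Assumes Python 3.7+, where dict preserves insertion order.
    order = {}
    for x in iterable:
        # Drop any earlier entry so that re-inserting moves x to the end.
        order.pop(x, None)
        order[x] = None
    return list(order)


print(keep_last_forward(iter([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])))
# [5, 7, 8, 6, 9]
```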