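I want to remove the tuples that have the same value at index 0, keeping only the first occurrence. I looked at other similar questions but did not find the specific answer I am looking for. Can someone help me? Below is my attempt.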
from itertools import groupby
import random
Newlist = []
abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]
Newlist = [random.choice(tuple(g)) for _, g in groupby(abc, key=lambda x: x[0])]
print Newlist
My expected output: [(1,2,3), (2,3,4), (0,2,0), (5,4,3)]
Answer 0 (score: 5)
A simple approach is to loop over the list and keep track of the elements you have already found:
abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]
found = set()
NewList = []
for a in abc:
    if a[0] not in found:
        NewList.append(a)
    found.add(a[0])
print(NewList)
#[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
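found is a set. On each iteration we check whether the first element of the tuple is already in found. If it is not, we append the whole tuple to NewList. At the end of each iteration we add the first element of the tuple to found.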
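The same loop can be wrapped in a small helper so that it works for any tuple position. This is only a sketch: the name first_by_index and its index parameter are made up for illustration and are not part of the answer above.

```python
def first_by_index(tuples, index=0):
    """Keep only the first tuple seen for each value at position `index`."""
    found = set()   # values at `index` that were already encountered
    result = []
    for t in tuples:
        if t[index] not in found:
            result.append(t)
            found.add(t[index])
    return result


abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]
print(first_by_index(abc))
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

Each membership test against the set takes constant time on average, so the whole pass is linear in the length of the list.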
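Answer 1 (score: 3)
The itertools recipes (Python 2: itertools recipes, but there is essentially no difference in this case) contain a recipe for this that is a bit more general than the implementation in the answer above. It also uses a set: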
Python 2:
from itertools import ifilterfalse as filterfalse
Python 3:
from itertools import filterfalse
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Use it like this:
abc = [(1,2,3), (2,3,4), (1,0,3),(0,2,0), (2,4,5),(5,4,3), (0,4,1)]
Newlist = list(unique_everseen(abc, key=lambda x: x[0]))
print(Newlist)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
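Because the set.add method is cached in a local name (only really relevant when your abc is large), this should be slightly faster, and it is also more general because it takes the key function as a parameter.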
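Apart from that, the same limitation I already mentioned in the comments applies: this only works if the first element of each tuple is actually hashable (which numbers, as in the given example, of course are).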
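To illustrate the hashability limitation, here is a minimal sketch in which the first element is a list. The helper first_by_key and the rows data are hypothetical; the helper is only a compact stand-in for the key branch of the recipe, not the recipe itself. Converting the unhashable key to a tuple inside the key function is one possible workaround.

```python
def first_by_key(items, key):
    """Return the first item for each distinct key, preserving order."""
    seen = set()
    out = []
    for item in items:
        k = key(item)
        if k not in seen:   # raises TypeError if k is unhashable
            seen.add(k)
            out.append(item)
    return out


rows = [([1, 2], 'a'), ([3], 'b'), ([1, 2], 'c')]

try:
    first_by_key(rows, key=lambda x: x[0])
except TypeError as exc:
    # Lists cannot be stored in (or looked up in) a set.
    print("unhashable key:", exc)

# Workaround: map the list to an equivalent hashable tuple.
print(first_by_key(rows, key=lambda x: tuple(x[0])))
# [([1, 2], 'a'), ([3], 'b')]
```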
Answer 2 (score: 2)
A better option using OrderedDict:
from collections import OrderedDict
abc = [(1,2,3), (2,3,4), (1,0,3), (0,2,0), (2,4,5),(5,4,3), (0,4,1)]
d = OrderedDict()
for t in abc:
    d.setdefault(t[0], t)
abc_unique = list(d.values())
print(abc_unique)
Output:
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
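As a side note, on Python 3.7 and later the built-in dict is guaranteed to preserve insertion order, so the same idea should work without the import. This variant is an assumption about the Python version in use, not something the answer states:

```python
abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

d = {}
for t in abc:
    # setdefault stores t only if the key t[0] is not present yet,
    # so later tuples with the same first element are ignored.
    d.setdefault(t[0], t)

abc_unique = list(d.values())
print(abc_unique)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```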
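Simple but not efficient (each tuple is compared against all earlier ones, so the running time is quadratic):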
abc = [(1,2,3), (2,3,4), (1,0,3), (0,2,0), (2,4,5),(5,4,3), (0,4,1)]
abc_unique = [t for i, t in enumerate(abc) if not any(t[0] == p[0] for p in abc[:i])]
print(abc_unique)
Output:
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
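Answer 3 (score: 2)
"But the question is explicitly about maintaining the order of the tuples. I don't think there is a solution using groupby"
I never miss an opportunity to use groupby(). Here is my solution without sorting (once or twice):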
from itertools import groupby, chain
abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]
Newlist = list((lambda s: chain.from_iterable(g for f, g in groupby(abc, lambda k: s.get(k[0]) != s.setdefault(k[0], True)) if f))({}))
print(Newlist)
Output:
% python3 test.py
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
%
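The one-liner is dense, so here is an unrolled sketch of what it appears to do. The key function reports whether a tuple's first element is being seen for the first time; groupby then splits the list into consecutive runs of "first time" and "seen before" tuples, and only the "first time" runs are kept. The names seen, is_first and result are illustrative and do not appear in the answer.

```python
from itertools import groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

seen = {}

def is_first(t):
    """True the first time t[0] is encountered, False afterwards."""
    first_time = t[0] not in seen
    seen[t[0]] = True
    return first_time

result = []
for flag, group in groupby(abc, key=is_first):
    group = list(group)      # consume the run before groupby advances
    print(flag, group)
    if flag:
        result.extend(group)

print(result)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

Note that the key function has a side effect, so this relies on groupby calling it once per element in order. It is a neat trick, but the plain loop with a set is easier to read.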
Answer 4 (score: 1)
To use groupby correctly, the sequence must be sorted:
>>> [next(g) for k,g in groupby(sorted(abc, key=lambda x:x[0]), key=lambda x:x[0])]
[(0, 2, 0), (1, 2, 3), (2, 3, 4), (5, 4, 3)]
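The reason sorting matters is that groupby only merges consecutive items with equal keys. A short sketch comparing the group keys with and without sorting (the variable names are illustrative):

```python
from itertools import groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]
first = lambda x: x[0]

# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_keys = [k for k, _ in groupby(abc, key=first)]
print(unsorted_keys)
# [1, 2, 1, 0, 2, 5, 0]

# After sorting by the same key, each key forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(abc, key=first), key=first)]
print(sorted_keys)
# [0, 1, 2, 5]
```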
Or, if you need exactly the order of your example (i.e. keeping the original order):
>>> [t[2:] for t in sorted([next(g) for k,g in groupby(sorted([(t[0], i)+t for i,t in enumerate(abc)]), lambda x:x[0])], key=lambda x:x[1])]
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
The trick here is to add a field holding the original position, so that the original order can be restored after the groupby() step.
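Spelled out step by step, the order-preserving one-liner might look like the sketch below (a decorate, sort, undecorate pattern). The intermediate names decorated and firsts are only illustrative.

```python
from itertools import groupby

abc = [(1, 2, 3), (2, 3, 4), (1, 0, 3), (0, 2, 0), (2, 4, 5), (5, 4, 3), (0, 4, 1)]

# 1. Decorate: prefix each tuple with (key, original index), then sort.
#    Sorting puts equal keys next to each other, earliest index first.
decorated = sorted((t[0], i) + t for i, t in enumerate(abc))

# 2. Group by key and keep the first (earliest) tuple of each group.
firsts = [next(g) for _, g in groupby(decorated, key=lambda x: x[0])]

# 3. Restore the original order using the stored index.
firsts.sort(key=lambda x: x[1])

# 4. Undecorate: drop the two helper fields.
result = [t[2:] for t in firsts]
print(result)
# [(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]
```

Because of the two sorts, this runs in O(n log n) time, whereas the set-based approaches need a single linear pass.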
Edit: even shorter:
>>> [t[1:] for t in sorted([next(g)[1:] for k,g in groupby(sorted([(t[0], i)+t for i,t in enumerate(abc)]), lambda x:x[0])])]
[(1, 2, 3), (2, 3, 4), (0, 2, 0), (5, 4, 3)]