t = [[a, b], [c, d], [a, e], [f, g], [c, d]]
How can I get a unique list of lists, so that the output equals:
output = [[a, b], [c, d], [a, e], [f, g]]
[c, d] appears twice, so the duplicate needs to be removed. [a, b] and [a, e] are distinct lists, regardless of the repeated 'a'.
Thanks!
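Answer 0 (score: 3)
OrderedDict will preserve order and give you the unique elements once we map the sublists to tuples to make them hashable; using t[:] will allow us to mutate the original object/list.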
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
from collections import OrderedDict
t[:] = map(list, OrderedDict.fromkeys(map(tuple, t)))
print(t)
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
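On Python 3.7 and later a plain dict also preserves insertion order, so the same idea can be sketched without the import. The helper name dedupe_in_place is hypothetical and only used for illustration; the alias variable is there to show that slice assignment changes the existing list object rather than rebinding the name.

```python
def dedupe_in_place(rows):
    """Remove duplicate sublists from rows in place, keeping first occurrences."""
    # Tuples are hashable, so they can be dict keys; dict keeps insertion order (3.7+).
    rows[:] = [list(key) for key in dict.fromkeys(map(tuple, rows))]


t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
alias = t  # a second reference to the same list object
dedupe_in_place(t)
print(t)           # [['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
print(alias is t)  # True
```

Note that this sketch rebuilds the inner lists from the tuples, so the surviving sublists are new objects.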
For Python 2, if you want to avoid creating intermediate lists, you can use itertools.imap:
from collections import OrderedDict
from itertools import imap
t[:] = imap(list, OrderedDict.fromkeys(imap(tuple, t)))
print(t)
You can also use the set.add or logic:
st = set()
t[:] = (st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st)
print(t)
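The expression relies on set.add always returning None, so `st.add(...) or sub` records the key and then evaluates to sub. A tiny sketch that spells out the two steps (the variable names returned and value are only for illustration):

```python
st = set()
sub = ["a", "b"]

# set.add mutates the set and returns None...
returned = st.add(tuple(sub))

# ...so `None or sub` falls through to the right-hand operand, the list itself.
value = returned or sub

print(returned)      # None
print(value is sub)  # True
print(st)            # {('a', 'b')}
```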
Which approach is fastest:
In [9]: t = [[randint(1,1000),randint(1,1000)] for _ in range(10000)]
In [10]: %%timeit
   ....: st = set()
   ....: [st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st]
   ....:
100 loops, best of 3: 5.8 ms per loop
In [11]: timeit list(map(list, OrderedDict.fromkeys(map(tuple, t))))
10 loops, best of 3: 24.1 ms per loop
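To reproduce the comparison outside IPython, a rough sketch with the standard timeit module follows. The function names, the list size and the repeat count are arbitrary choices for illustration, and timings depend on the machine and Python version, so no figures are shown here.

```python
import timeit
from collections import OrderedDict
from random import randint


def with_set(rows):
    st = set()
    return [st.add(tuple(sub)) or sub for sub in rows if tuple(sub) not in st]


def with_ordereddict(rows):
    return list(map(list, OrderedDict.fromkeys(map(tuple, rows))))


t = [[randint(1, 1000), randint(1, 1000)] for _ in range(10000)]

# Both approaches should agree before their timings are compared.
assert with_set(t) == with_ordereddict(t)

runs = 20
for func in (with_set, with_ordereddict):
    seconds = timeit.timeit(lambda: func(t), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1000:.2f} ms per call")
```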
If ["a","e"] is considered the same as ["e","a"], you can use a frozenset:
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"], ["e","a"]]
st = set()
t[:] = (st.add(frozenset(sub)) or sub for sub in t if frozenset(sub) not in st)
print(t)
Output:
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
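One caveat with frozenset: it also ignores repeated elements inside a sublist, so ["a", "a", "b"] and ["a", "b"] would count as the same. If that matters, one alternative is to key on the sorted contents instead, assuming the elements can be ordered. A sketch (the function name is illustrative):

```python
def dedupe_ignoring_order(rows):
    """Keep the first sublist for each multiset of elements."""
    seen = set()
    result = []
    for sub in rows:
        key = tuple(sorted(sub))  # order-insensitive, but keeps repeat counts
        if key not in seen:
            seen.add(key)
            result.append(sub)
    return result


t = [["a", "e"], ["e", "a"], ["a", "a", "b"], ["a", "b"]]
result = dedupe_ignoring_order(t)
print(result)  # [['a', 'e'], ['a', 'a', 'b'], ['a', 'b']]

# With frozenset the last two sublists would collapse into one key.
print(frozenset(["a", "a", "b"]) == frozenset(["a", "b"]))  # True
```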
To avoid calling tuple twice, you can create a function:
def unique(l):
    st, it = set(), iter(l)
    for tup in map(tuple, l):
        if tup not in st:
            yield next(it)
        else:
            next(it)
        st.add(tup)
Which runs a bit faster:
In [21]: timeit list(unique(t))
100 loops, best of 3: 5.06 ms per loop
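A slightly more general variant of the same generator takes a key function, in the spirit of the unique_everseen recipe from the itertools documentation. This is a sketch with a hypothetical name, unique_by, and it has not been benchmarked against the version above.

```python
def unique_by(iterable, key=tuple):
    """Yield items whose key has not been seen before, preserving order."""
    seen = set()
    for item in iterable:
        k = key(item)  # computed once per item
        if k not in seen:
            seen.add(k)
            yield item


t = [["a", "b"], ["c", "d"], ["a", "e"], ["e", "a"], ["c", "d"]]
by_tuple = list(unique_by(t))
by_frozenset = list(unique_by(t, key=frozenset))
print(by_tuple)      # [['a', 'b'], ['c', 'd'], ['a', 'e'], ['e', 'a']]
print(by_frozenset)  # [['a', 'b'], ['c', 'd'], ['a', 'e']]
```

Passing key=frozenset gives the order-insensitive behaviour without a second function.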
Answer 1 (score: 2)
A simple solution
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
output = []
for elem in t:
    if elem not in output:
        output.append(elem)
print(output)
Output
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
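This approach compares with == rather than hashing, so it needs a linear scan of output for every element and gets slow on large inputs. On the other hand, it still works when the sublists contain unhashable items, where tuple-based keys would raise TypeError. A small sketch (the function name is illustrative):

```python
def dedupe_by_equality(rows):
    """Keep the first occurrence of each element, comparing with == only."""
    output = []
    for elem in rows:
        if elem not in output:  # linear scan, no hashing required
            output.append(elem)
    return output


# Sublists that themselves contain lists cannot be turned into hashable tuples.
nested = [["a", ["x"]], ["b", ["y"]], ["a", ["x"]]]
result = dedupe_by_equality(nested)
print(result)  # [['a', ['x']], ['b', ['y']]]

try:
    set(map(tuple, nested))
    hashing_failed = False
except TypeError:
    hashing_failed = True
print(hashing_failed)  # True
```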
Answer 2 (score: 0)
You can do this with set (if the order of the inner lists does not matter):
>>> t = [['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g'], ['c', 'd']]
>>> as_tuples = [tuple(l) for l in t]
>>> set(as_tuples)
{('a', 'b'), ('a', 'e'), ('c', 'd'), ('f', 'g')}
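The result above is a set of tuples, and sets have no guaranteed order. If a list of lists is needed, one option is to convert back and sort for a reproducible result, assuming the elements are mutually comparable:

```python
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]

unique_tuples = set(tuple(l) for l in t)

# sorted() gives a reproducible order; the original order is not recovered.
unique_lists = [list(tup) for tup in sorted(unique_tuples)]
print(unique_lists)  # [['a', 'b'], ['a', 'e'], ['c', 'd'], ['f', 'g']]
```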
Answer 3 (score: 0)
A simple approach, assuming you do not want to copy the inner lists and want to minimize allocations.
# Assumption: nested_lst contains only lists of simple values (float, int, bool)
def squashDups(nested_lst):
    ref_set = set()
    new_nested_lst = []
    for lst in nested_lst:
        tup = tuple(lst)
        if tup not in ref_set:
            new_nested_lst.append(lst)
            ref_set.add(tup)
    return new_nested_lst
>>> lst = [ [1,2], [3,4], [3,4], [1,2], [True,False], [False,True], [True,False] ]
>>> squashDups(lst)
[[1, 2], [3, 4], [True, False], [False, True]]
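One detail worth knowing when mixing bool, int and float values: in Python True == 1 and False == 0, and equal numbers hash alike, so a tuple-based set treats [1, 0], [True, False] and [1.0, 0.0] as duplicates of each other. A quick check:

```python
seen = set()
seen.add(tuple([1, 0]))

# bool is a subclass of int: True == 1 and hash(True) == hash(1)
bool_is_duplicate = tuple([True, False]) in seen
print(bool_is_duplicate)  # True

# Equal ints, bools and floats collapse to a single set member.
distinct_keys = len({(1, 0), (True, False), (1.0, 0.0)})
print(distinct_keys)  # 1
```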
Answer 4 (score: -1)
If you care about order, this should work:
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
i = len(t) - 1
while i >= 0:
    if t.count(t[i]) > 1:
        t.pop(i)
    i -= 1
print(t)
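Note that list.count scans the whole list on every iteration, so this loop is quadratic in the length of t. It does keep the first occurrence of each sublist and modifies t in place. A sketch that wraps the loop in a function (hypothetical name) and checks it against a dict-based result on Python 3.7+:

```python
def remove_later_duplicates(rows):
    """Pop duplicates from the end so the first occurrence of each sublist survives."""
    i = len(rows) - 1
    while i >= 0:
        if rows.count(rows[i]) > 1:  # O(n) scan on every step
            rows.pop(i)
        i -= 1


t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"], ["a", "b"]]
expected = [list(k) for k in dict.fromkeys(map(tuple, t))]
remove_later_duplicates(t)
print(t)              # [['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
print(t == expected)  # True
```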