Question

t = [[a, b], [c, d], [a, e], [f, g], [c, d]]

How can I get a list of unique lists, so that the output equals:

output = [[a, b], [c, d], [a, e], [f, g]]

[c, d] appears twice, so the duplicate needs to be removed. [a, b] and [a, e] count as distinct lists even though both contain 'a'.

Thanks!

Answer 1

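An OrderedDict preserves order and yields unique elements once the sublists are mapped to tuples to make them hashable. Assigning to t[:] mutates the original list object in place.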

t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]

from collections import OrderedDict

t[:] = map(list, OrderedDict.fromkeys(map(tuple, t)))

print(t)
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
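On Python 3.7 and later the built-in dict also preserves insertion order, so the same idea can be sketched without importing OrderedDict. This is a minimal variant, assuming every sublist holds only hashable items; the name dedupe_ordered is made up for illustration and is not part of the answer above.

```python
def dedupe_ordered(nested):
    """Return the unique sublists in first-seen order (hypothetical helper)."""
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    # (guaranteed for plain dicts from Python 3.7 on).
    return [list(tup) for tup in dict.fromkeys(map(tuple, nested))]


t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
t[:] = dedupe_ordered(t)  # slice assignment mutates the original list object
print(t)
```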

On Python 2, use itertools.imap if you want to avoid building intermediate lists:

from collections import OrderedDict
from itertools import imap

t[:] = imap(list, OrderedDict.fromkeys(imap(tuple, t)))

print(t)

You can also use a set together with the "set.add(...) or sub" idiom:

st = set()

t[:] = (st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st)

print(t)
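The one-liner above relies on set.add returning None, so "st.add(key) or sub" records the key and then evaluates to sub. The loop below is only an unrolled sketch of that expression to show the mechanics; the names seen, result and added are illustrative.

```python
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]

seen = set()
result = []
for sub in t:
    key = tuple(sub)
    if key in seen:              # same role as the "if tuple(sub) not in st" filter
        continue
    added = seen.add(key)        # set.add mutates the set and returns None
    result.append(added or sub)  # "None or sub" evaluates to sub
print(result)
```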

Which approach is faster:

In [9]: t = [[randint(1,1000),randint(1,1000)] for _ in range(10000)]

In [10]: %%timeit                                                     
st = set()
[st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st]
   ....: 
100 loops, best of 3: 5.8 ms per loop

In [11]: timeit list(map(list, OrderedDict.fromkeys(map(tuple, t))))  
10 loops, best of 3: 24.1 ms per loop
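The comparison can also be reproduced outside IPython with the standard timeit module. This is a rough sketch, assuming the same random test data as above; the wrapper names with_set and with_ordered_dict are made up here, and the absolute numbers will differ by machine and Python version.

```python
import timeit
from collections import OrderedDict
from random import randint

t = [[randint(1, 1000), randint(1, 1000)] for _ in range(10000)]


def with_set(data):
    st = set()
    return [st.add(tuple(sub)) or sub for sub in data if tuple(sub) not in st]


def with_ordered_dict(data):
    return list(map(list, OrderedDict.fromkeys(map(tuple, data))))


# Both approaches should agree before their speed is compared.
assert with_set(t) == with_ordered_dict(t)

for func in (with_set, with_ordered_dict):
    # Best of 3 repeats, 10 calls per repeat, reported per call.
    best = min(timeit.repeat(lambda: func(t), number=10, repeat=3)) / 10
    print(f"{func.__name__}: {best * 1000:.2f} ms per call")
```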

If ["a", "e"] should be treated as the same as ["e", "a"], you can use a frozenset:

t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"], ["e","a"]]
st = set()
t[:] = (st.add(frozenset(sub)) or sub for sub in t if frozenset(sub) not in st)

print(t)

Output:

[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
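One caveat with frozenset keys is that they also ignore repeated items inside a sublist, so ["a", "a", "b"] and ["a", "b"] would count as the same. If that is not wanted, a sorted tuple can serve as the key instead. The sketch below assumes the items within each sublist are mutually comparable (so sorted works); dedupe_unordered is a hypothetical name.

```python
def dedupe_unordered(nested):
    """Drop sublists that match an earlier one up to item order (hypothetical helper)."""
    seen = set()
    out = []
    for sub in nested:
        key = tuple(sorted(sub))  # order-insensitive, but keeps repeated items
        if key not in seen:
            seen.add(key)
            out.append(sub)
    return out


t = [["a", "b"], ["a", "e"], ["e", "a"], ["a", "a", "b"]]
print(dedupe_unordered(t))
```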

To avoid calling tuple twice, you can write a function:

def unique(l):
    st, it = set(), iter(l)
    for tup in map(tuple, l):
        if tup not in st:
            yield next(it)
        else:
            next(it)
        st.add(tup)

This runs a little faster:

In [21]: timeit list(unique(t))
100 loops, best of 3: 5.06 ms per loop
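The same generator idea can be generalized with a key function, similar in spirit to the unique_everseen recipe in the itertools documentation. This is a sketch rather than a tuned implementation, and no speed claim is made for it; unique_by and its key parameter are names chosen here for illustration.

```python
def unique_by(iterable, key=tuple):
    """Yield items whose key has not been seen before (hypothetical helper)."""
    seen = set()
    for item in iterable:
        k = key(item)  # computed once per item
        if k not in seen:
            seen.add(k)
            yield item


t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"], ["e", "a"]]
print(list(unique_by(t)))                 # order inside a sublist matters
print(list(unique_by(t, key=frozenset)))  # order inside a sublist is ignored
```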

Answer 2

A simple solution

t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
output = []

for elem in t:
    # "in" on a list compares elem against every stored sublist
    if elem not in output:
        output.append(elem)

print(output)

Output

[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
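Because the membership test scans the output list, this loop does quadratic work in the worst case, but it only needs equality, not hashing. That makes it usable when the sublists contain unhashable items. A small sketch with made-up data:

```python
# Sublists holding a list or a dict cannot be turned into hashable tuples.
t = [[["a"], "b"], [{"c": 1}, "d"], [["a"], "b"]]

output = []
for elem in t:
    if elem not in output:  # uses ==, so no hashing is required
        output.append(elem)
print(output)

try:
    hash(tuple(t[0]))  # the tuple still contains a list
except TypeError as exc:
    print("set-based approaches fail here:", exc)
```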

Answer 3

You can do this with a set (if the order of the sublists does not matter):

>>> t = [['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g'], ['c', 'd']]
>>> as_tuples = [tuple(l) for l in t]
>>> set(as_tuples)
{('a', 'b'), ('a', 'e'), ('c', 'd'), ('f', 'g')}
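If a list of lists is needed again, the tuples can be converted back. Since a set has no defined order, sorting is one way to get a deterministic result. This is a sketch, assuming the items are mutually comparable:

```python
t = [['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g'], ['c', 'd']]

# Deduplicate via tuples, convert back to lists, then sort for a stable order.
unique_sorted = sorted(list(tup) for tup in set(map(tuple, t)))
print(unique_sorted)
```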

Answer 4

A simple approach, assuming you don't want to create new inner lists and want to keep allocations to a minimum.

# Assumption; nested_lst contains only lists with simple values (floats, int, bool)
def squashDups( nested_lst ):
    ref_set = set()
    new_nested_lst = []
    for lst in nested_lst:
        tup = tuple(lst)
        if tup not in ref_set:
            new_nested_lst.append(lst)
            ref_set.add(tup)
    return new_nested_lst

>>> lst = [ [1,2], [3,4], [3,4], [1,2], [True,False], [False,True], [True,False] ]
>>> squashDups(lst)
[[1, 2], [3, 4], [True, False], [False, True]]
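One thing to watch with mixed simple values: in Python 1 == True == 1.0 and they hash alike, so a plain tuple key treats [1, 0], [True, False] and [1.0, 0.0] as duplicates. If those should stay distinct, the key can include each value's type. A sketch under that assumption; squash_dups_typed is a hypothetical variant, not the function above.

```python
def squash_dups_typed(nested_lst):
    """Like squashDups, but values of different types never match (hypothetical)."""
    seen = set()
    result = []
    for lst in nested_lst:
        # Pair each value with its type so that 1, True and 1.0 produce different keys.
        key = tuple((type(x), x) for x in lst)
        if key not in seen:
            seen.add(key)
            result.append(lst)
    return result


lst = [[1, 0], [True, False], [1.0, 0.0], [1, 0]]
print(squash_dups_typed(lst))
```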

Answer 5

If you care about order, this should work:

t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
i = len(t) - 1
while i >= 0:
    if t.count(t[i]) > 1:
        t.pop(i)
    i -= 1
print(t)
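The loop above calls t.count for every element, which rescans the list each time. For longer inputs, an in-place variant that tracks seen keys in a set does a single pass. This is a sketch assuming the sublists contain only hashable items; dedupe_in_place is a name made up for illustration.

```python
def dedupe_in_place(nested):
    """Remove later duplicates from nested, keeping first occurrences (hypothetical)."""
    seen = set()
    write = 0  # next slot for a kept sublist; never ahead of the read position
    for sub in nested:
        key = tuple(sub)
        if key not in seen:
            seen.add(key)
            nested[write] = sub
            write += 1
    del nested[write:]  # trim the leftover tail


t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
dedupe_in_place(t)
print(t)
```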
