Identifying duplicate tuples that differ only in order within a list of tuples

Time: 2021-05-15 01:25:04

Tags: python-3.x

I have a list of tuples named "uplink", as shown below:

uplink is [(6, 26), (15, 26), (26, 48), (26, 65), (48, 26), (48, 92), (65, 26), (65, 92), (88, 26), (92, 48), (92, 65)]

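I want to identify tuples that contain the same entries in a different order, e.g. (48, 92) and (92, 48), and append one of them to a different list, downlink, for further processing. I also want this duplicate removed from the uplink list.

My attempt is as follows: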

        for u in uplink:
            A = u[0]
            B = u[1]
            if (A,B) == (B,A):
                downlink.append(u)
                uplink.remove(u)

This does not work. Any help would be greatly appreciated. Thanks.

1 Answer:

Answer 0 (score: 2)

You can make use of Counter and frozensets:

>>> x = [(6, 26), (15, 26), (26, 48), (26, 65), (48, 26), (48, 92), (65, 26), (65, 92), (88, 26), (92, 48), (92, 65)]
>>> from collections import Counter
>>> count = Counter(frozenset({first, second}) for first, second in x)
>>> count
Counter({frozenset({48, 26}): 2, frozenset({65, 26}): 2, frozenset({48, 92}): 2, frozenset({65, 92}): 2, frozenset({26, 6}): 1, frozenset({26, 15}): 1, frozenset({88, 26}): 1})
>>> [(first, second) for (first, second), count in count.items() if count > 1]
[(48, 26), (65, 26), (48, 92), (65, 92)]
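
The Counter approach reports which pairs occur in both orders, but it does not by itself move anything between lists. The loop in the question cannot do it either: `(A, B) == (B, A)` is only true when both items are equal, and removing items from a list while iterating over it skips elements. Below is a minimal sketch of one way to do the split in a single pass. The helper name `split_reversed_pairs` is hypothetical, and the sketch assumes that the first-seen orientation stays in uplink while the later, reversed one goes to downlink.

```python
def split_reversed_pairs(pairs):
    """Split pairs into (kept, moved).

    Assumption: the first-seen orientation of a pair is kept, and a later
    tuple that is its exact reversal is moved to the second list.
    """
    seen = set()          # orientations that have been kept so far
    kept, moved = [], []
    for a, b in pairs:
        if a != b and (b, a) in seen:
            moved.append((a, b))   # reversed copy of an earlier tuple
        else:
            seen.add((a, b))
            kept.append((a, b))
    return kept, moved


uplink = [(6, 26), (15, 26), (26, 48), (26, 65), (48, 26), (48, 92),
          (65, 26), (65, 92), (88, 26), (92, 48), (92, 65)]

# Build two new lists instead of mutating uplink during iteration.
uplink, downlink = split_reversed_pairs(uplink)
```

With the sample data, downlink ends up holding (48, 26), (65, 26), (92, 48) and (92, 65), and those four tuples no longer appear in uplink. Exact repeats of the same orientation are left in place, since the question only asks about reversed pairs.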

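Edit

Fixed a bug in the answer above: it did not account for the case where the first and second items passed to frozenset are equal. For simplicity, I just filter those pairs out.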
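
To illustrate why equal items are a problem, here is a small sketch with made-up sample data (the list `pairs` is hypothetical). A tuple such as (5, 5) collapses to a one-element frozenset, so unpacking it into two names raises ValueError.

```python
from collections import Counter

pairs = [(5, 5), (1, 2), (2, 1)]  # hypothetical sample data

# frozenset({5, 5}) holds a single element, so two-name unpacking fails.
try:
    first, second = frozenset({5, 5})
    unpack_failed = False
except ValueError:
    unpack_failed = True

# Filtering out equal pairs first keeps every key at exactly two elements.
counts = Counter(frozenset(p) for p in pairs if p[0] != p[1])

# sorted() gives a deterministic order; plain frozenset iteration order is arbitrary.
duplicates = [tuple(sorted(key)) for key, n in counts.items() if n > 1]
```

Here `unpack_failed` is True and `duplicates` contains the single pair (1, 2).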

from random import randint
import timeit
from collections import Counter

x = [(randint(1, 20), randint(1, 20)) for i in range(100)]

def frozen_set_counter():
    # Key each pair by an order-independent frozenset. Pairs with equal items are
    # skipped: they would collapse to one element and break the unpacking below.
    counts = Counter(frozenset({first, second}) for first, second in x if first != second)
    return [(first, second) for (first, second), n in counts.items() if n > 1]

def min_max_counter():
    # Same idea, but normalize each pair to a (smaller, larger) tuple instead.
    counts = Counter((min(first, second), max(first, second)) for first, second in x if first != second)
    return [(first, second) for (first, second), n in counts.items() if n > 1]

print(timeit.timeit(lambda: frozen_set_counter(), number=10000))
print(timeit.timeit(lambda: min_max_counter(), number=10000))

0.48902677999999994
0.687653337