Removing duplicate elements in Python

Time: 2015-03-10 03:16:39

Tags: python

Given a list such as:

[
    'A',
    [
        'B',
        ['D','C'],
        ['C','D']
    ],
    ['C','D'],
    ['E','F'],
    ['F','E'],
    [
        ['not','M'],
        'N',
        ['not','M']
    ]
]

I want to remove the duplicate elements in it. For the list above, the result should be:

[
    'A',
    [
        'B',
        ['C','D']
    ],
    ['C','D'],
    ['E','F'],
    [
        ['not','M'],
        'N'
    ]
]

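There are two rules:
['not','A'] stands for ~A and can be treated as a single element.
If the values are the same but the order differs, we treat them as the same, so ['C','D'] and ['D','C'] are the same.
Can anyone help me write a Python function that implements this?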

1 answer:

Answer 0 (score: 1)

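Overall, I agree with Klaus D.'s comment. But I think this question is interesting, because the simplest approach I can think of is list -> tuples -> sets, which gets tricky once you want to remove elements. It would also lose the order of the original input list (as Adam Smith noted).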
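To make the list -> tuples -> sets idea concrete, here is a minimal sketch for the simplest case only: a flat list whose items are all lists of strings. The name `dedupe_flat` is hypothetical, and this is one reading of the approach mentioned above, not code from the original answer. It does not handle nested lists or a mix of strings and lists.

```python
def dedupe_flat(items):
    """Drop duplicates from a flat list of string lists, ignoring inner order."""
    # sorted() + tuple() gives each inner list a hashable, order-independent form.
    unique = {tuple(sorted(inner)) for inner in items}
    # A set has no defined order, so the original positions are lost here.
    return [list(t) for t in unique]


pairs = [['C', 'D'], ['D', 'C'], ['E', 'F'], ['F', 'E']]
# sorted() is applied only to make the printed order deterministic.
print(sorted(dedupe_flat(pairs)))  # [['C', 'D'], ['E', 'F']]
```

The sketch shows both drawbacks at once: the result no longer follows the input order, and each surviving item is the sorted form rather than the first one seen.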

So with that in mind, consider:

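import itertools

def _reduce(lst):
    # Non-list values (e.g. strings) are leaves and are returned unchanged.
    if not isinstance(lst, list): return lst

    seen = []
    for x in lst:
        # Skip x if any reordering of it has already been kept.
        # Note: for a string x, permutations(x) yields tuples of characters,
        # so a repeated plain string is generally not recognised as a duplicate.
        if not any(list(perm) in seen for perm in itertools.permutations(x)):
            seen.append(_reduce(x))
    return seen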

You can run it with:

lst = [
    'A',
    [
        'B',
        ['D','C'],
        ['C','D']
    ],
    ['C','D'],
    ['E','F'],
    ['F','E'],
    [
        ['not','M'],
        'N',
        ['not','M']
    ]
]
print(_reduce(lst))

Which outputs:

[
    'A', 
    [
        'B', 
        ['D', 'C']
    ], 
    ['C', 'D'], 
    ['E', 'F'], 
    [
        ['not', 'M'], 
        'N'
    ]
]

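Note that this preserves the input order of the list elements. Also note that, as a result, this output differs slightly from the expected output: ['D', 'C'] is kept and ['C', 'D'] is discarded.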


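Edit: Based on your comment, itertools.permutations() is not enough, because you seem to want a recursive function that also considers permutations of the sub-elements. How about: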

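import itertools

def _permutations(x):
    # A non-list value has exactly one "permutation": itself. Wrapping it in a
    # list keeps itertools.product from iterating over a string's characters.
    if not isinstance(x, list): return [x]

    perms = []
    # Pick one permutation for each child, then permute the children themselves.
    for prod in itertools.product(*[_permutations(elem) for elem in x]):
        for perm in itertools.permutations(prod):
            perms.append(list(perm))

    return perms

def _reduce(lst):
    if not isinstance(lst, list): return lst

    seen = []
    for x in lst:
        # Skip x if any recursive reordering of it has already been kept.
        if not any(perm in seen for perm in _permutations(x)):
            seen.append(_reduce(x))
    return seen

# Helper that is not used below: sorted string forms of every permutation of x.
def lexsort(x): return sorted(str(e) for e in _permutations(x))

arrs = [
    ['B',['C','D']],
    [['D','C'],'B'],
]

print(_reduce(arrs))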

Output:

[['B', ['C', 'D']]]
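
Enumerating permutations grows factorially with the length of each list, so here is an alternative sketch that avoids it by giving every element an order-independent key. This is not the method from the answer above. The names `canonical` and `dedupe` are hypothetical, and the sketch assumes every leaf is a hashable value such as a string.

```python
def canonical(x):
    """Hashable, order-independent key for a string or a nested list."""
    if isinstance(x, list):
        # A frozenset ignores both the order and the repetition of children.
        return frozenset(canonical(e) for e in x)
    return x


def dedupe(lst):
    """Recursively drop duplicates, keeping the first occurrence of each."""
    if not isinstance(lst, list):
        return lst
    seen = set()
    out = []
    for x in lst:
        key = canonical(x)
        if key not in seen:
            seen.add(key)
            out.append(dedupe(x))  # clean up inside the kept element too
    return out


arrs = [
    ['B', ['C', 'D']],
    [['D', 'C'], 'B'],
]
print(dedupe(arrs))  # [['B', ['C', 'D']]]
```

Like the answer's code, this keeps the first occurrence in input order, so for the list in the question it keeps ['D', 'C'] inside the second element rather than ['C', 'D']. Because the key is a frozenset, a list that differs from an earlier one only by repeated children is also treated as a duplicate.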