Removing duplicates from a list of custom data structures

Time: 2014-07-08 18:40:31

Tags: python

class md5Check(object):
    """Pairs a file's full path with its MD5 checksum."""
    def __init__(self, md5, fullpath):
        super(md5Check, self).__init__()
        self.fullpath = fullpath
        self.md5 = md5

imageFiles = list()
# md5Sum and fullpath are computed elsewhere for each file
temp = md5Check(md5Sum, fullpath)
imageFiles.append(temp)

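I want to remove duplicates from a list of md5Check objects. Two instances count as duplicates when their md5 attribute is the same. What is a good way to remove the duplicates?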

1 Answer:

Answer 0: (score: 1)

Since md5 is hashable, you can use a set to keep track of the md5 values you have already seen.

seen = set()
imageFiles = [x for x in imageFiles if x.md5 not in seen and not seen.add(x.md5)]
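
The one-liner works because set.add() returns None and `and` short-circuits, so the second half of the condition only runs for values not yet in the set. A minimal sketch of that behaviour, using made-up strings in place of real md5 values:

```python
seen = set()
values = ["a1", "b2", "a1", "c3", "b2"]  # hypothetical stand-ins for md5 strings

# set.add() always returns None, so `not seen.add(v)` is always True.
# It is only evaluated when `v not in seen` is True (short-circuit `and`),
# which means each new value is recorded exactly once as it is kept.
unique = [v for v in values if v not in seen and not seen.add(v)]

print(unique)  # ['a1', 'b2', 'c3']
```

The first occurrence of each value is kept and the original order is preserved.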

If you don't like relying on a side effect inside the comprehension:

seen = set()
imageFiles_new = []
for x in imageFiles:
    if x.md5 not in seen:
        imageFiles_new.append(x)
        seen.add(x.md5)
imageFiles = imageFiles_new
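
As a variation on the same idea (not part of the original answer), the first-seen item per md5 can be collected in a collections.OrderedDict keyed by md5. The sketch below is self-contained; the helper name dedupe_by_md5, the md5 strings, and the file paths are all hypothetical and only there for illustration.

```python
from collections import OrderedDict


class md5Check(object):
    """Same shape as the class in the question: an MD5 string plus a path."""
    def __init__(self, md5, fullpath):
        self.md5 = md5
        self.fullpath = fullpath


def dedupe_by_md5(items):
    """Return a new list with only the first item seen for each md5 value."""
    first_by_md5 = OrderedDict()
    for item in items:
        # setdefault stores the item only if this md5 has no entry yet
        first_by_md5.setdefault(item.md5, item)
    return list(first_by_md5.values())


# Hypothetical sample data: the third entry duplicates the first by md5.
imageFiles = [
    md5Check("aaa", "/tmp/one.png"),
    md5Check("bbb", "/tmp/two.png"),
    md5Check("aaa", "/tmp/copy_of_one.png"),
]
imageFiles = dedupe_by_md5(imageFiles)
print([f.fullpath for f in imageFiles])  # ['/tmp/one.png', '/tmp/two.png']
```

Like the set-based versions, this keeps insertion order and the earliest instance for each md5. Which form to use is mostly a matter of taste.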