Removing duplicates in Python

Date: 2011-03-09 19:58:41

Tags: python list duplicates tuples

I have the following tuples in an array:

  ('0000233/02', 50.0, None, None, None, None, 'Yes') 
  ('0000233/02', 200.0, None, None, None, None, 'Yes') 

If I iterate over the list, how can I eliminate the duplicates based on the first element?

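6 answers:

Answer 0 (score: 3)

Put them into a dict, using the first element as the key. If you check whether the key is already present before adding, you keep the first item with that key; otherwise you end up with the last one.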
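A minimal sketch of both variants described above, assuming the tuples are held in a list. The function name `dedupe_by_first` and its `keep` parameter are illustrative and not part of the original answer, and the result order relies on dicts preserving insertion order (Python 3.7+).

```python
def dedupe_by_first(items, keep="first"):
    """Return items with duplicates (judged by element 0) removed.

    `keep` is a hypothetical parameter: "first" keeps the first item
    seen for each key, anything else keeps the last one.
    """
    result = {}
    for item in items:
        key = item[0]
        if keep == "first" and key in result:
            continue  # checking before adding keeps the first item seen
        result[key] = item  # without the check, later items overwrite earlier ones
    return list(result.values())


rows = [
    ('0000233/02', 50.0, None, None, None, None, 'Yes'),
    ('0000233/02', 200.0, None, None, None, None, 'Yes'),
]
first_kept = dedupe_by_first(rows, keep="first")
last_kept = dedupe_by_first(rows, keep="last")
```

With the two tuples from the question, `first_kept` holds the 50.0 tuple and `last_kept` holds the 200.0 tuple.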

Answer 1 (score: 1)

First, take a look at: http://docs.python.org/faq/programming.html#how-do-you-remove-duplicates-from-a-list

>>> l=[('0000233/02', 50.0, None, None, None, None, 'Yes'), ('0000233/02', 200.0, None, None, None, None, 'Yes') ]
>>> dic={}
>>> for i in l: dic[i[0]]=i
...   
>>> dic
{'0000233/02': ('0000233/02', 200.0, None, None, None, None, 'Yes')}
>>> list(dic.values())
[('0000233/02', 200.0, None, None, None, None, 'Yes')]

Answer 2 (score: 1)

An ad hoc solution:

def unique_elem0( iterable ):
    seen = set()
    seen_add = seen.add
    for element in iterable:
        key = element[0]
        if key not in seen:
            seen_add(key)
            yield element

print(list(unique_elem0(lst)))  # lst is the list of tuples from the question

A "copy the code from the itertools recipes" solution:

from itertools import filterfalse  # named ifilterfalse in Python 2
from operator import itemgetter

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        # No key function: the elements themselves are the keys.
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

# lst is the list of tuples from the question
print(list(unique_everseen(lst, key=itemgetter(0))))

Answer 3 (score: 1)

If your input is sorted (or at least all duplicates are grouped together), a slightly different approach is to use itertools.groupby:

import itertools, operator

def filter_duplicates(items):
    for key, group in itertools.groupby(items, operator.itemgetter(0)):
        yield next(group)

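This picks the first item of each run of duplicates (grouped by the first element). It is more efficient than the set/dict-based approaches because it needs no auxiliary data structure, and it preserves the order of the sequence. However, it does rely on the duplicates being adjacent; if they can appear anywhere in the stream, use one of the other approaches.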
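If the duplicates are not adjacent, one option is to sort by the first element before grouping. The sketch below is a variation on the answer rather than the author's code; the name `filter_sorted_duplicates` is made up for illustration. Note that sorting builds a new list and reorders the items by key, so it gives up the no-extra-structure advantage mentioned above.

```python
import itertools
import operator


def filter_sorted_duplicates(items):
    """Sort by element 0, then keep the first item of each group."""
    first = operator.itemgetter(0)
    # sorted() is stable, so within one key the original order is kept
    # and the earliest tuple for each key is the one that survives.
    ordered = sorted(items, key=first)
    return [next(group) for _, group in itertools.groupby(ordered, key=first)]


rows = [
    ('0000233/02', 50.0, None, None, None, None, 'Yes'),
    ('0000111/01', 10.0, None, None, None, None, 'Yes'),  # made-up extra row
    ('0000233/02', 200.0, None, None, None, None, 'Yes'),
]
deduped = filter_sorted_duplicates(rows)
```

Here the two '0000233/02' tuples are separated by another row, so plain groupby would treat them as two groups; after sorting they form a single group and only the 50.0 tuple is kept.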

Answer 4 (score: 0)

Quick way: build a dictionary keyed on the element you want to compare by.

# This will leave the last tuple found with that 1st value in the dict:
d = {}
for t in tuples:
    d[t[0]] = t  # a later tuple overwrites an earlier one with the same key

# This will leave the first tuple found, instead of the last:
d = {}
for t in tuples:
    d.setdefault(t[0], t) # setdefault sets the value if it's missing.

Answer 5 (score: 0)

If you don't care about the order of the elements after the first one, this is easy:

>>> t1= ('0000233/02', 50.0, None, None, None, None, 'Yes')
>>> t2= ('0000233/02', 200.0, None, None, None, None, 'Yes')
>>> t1=(t1[0],)+tuple(set(t1[1:]))
>>> t2=(t2[0],)+tuple(set(t2[1:]))
>>> t1
('0000233/02', 50.0, None, 'Yes')
>>> t2
('0000233/02', 200.0, 'Yes', None)

If you do care about the order:

>>> t2= ('0000233/02', 200.0, None, None, None, None, 'Yes')
>>> nd=[]
>>> garbage=[nd.append(i) for i in t2 if not nd.count(i)]
>>> t2=tuple(nd)
>>> t2
('0000233/02', 200.0, None, 'Yes')
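On Python 3.7 and later, where dicts preserve insertion order, the same order-preserving result can be sketched more compactly with `dict.fromkeys`. This is an alternative to the list-comprehension approach above, not part of the original answer, and it assumes every element of the tuple is hashable.

```python
t2 = ('0000233/02', 200.0, None, None, None, None, 'Yes')

# dict.fromkeys keeps only the first occurrence of each value, in order;
# converting the keys back to a tuple gives the deduplicated tuple.
deduped = tuple(dict.fromkeys(t2))
print(deduped)  # ('0000233/02', 200.0, None, 'Yes')
```

This avoids the repeated `nd.count(i)` scans, which make the list-based version quadratic in the tuple length.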