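Finding duplicate elements in a list of tuples

Date: 2017-10-27 11:43:31

Tags: python

Given a list of tuples, if the first and last elements of a tuple match those of another tuple, I want to add their second elements together.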

    p =[(u'basic', 7698, '01-2017'),
    (u'basic', 7685, '01-2017'),
    (u'Gross', 4875.0, u'01-2017'),
    (u'Gross', 4875.0, u'01-2017')]

The output should look like this:

    [(u'basic',15383,'01-2017'),(u'Gross', 9750.0, u'01-2017')]

This is what I tried:

    o=[]
    for i in p:
      if i[2] not in o:
         o.append(i[2])
      if i[0] not in o:
         o.append(i[0])
    count +=i[1]
    o.append(count)

My output:

    ['01-2017', 'basic', u'Gross', 53050.0, 4875.0]

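3 answers:

Answer 0 (score: 1)

You can use a defaultdict for this. Use the first and last elements of each tuple as the key, and accumulate the second element as the value by addition: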

    from collections import defaultdict

    l = [(u'basic', 7698, '01-2017'),
         (u'basic', 7685, '01-2017'),
         (u'Gross', 4875.0, u'01-2017'),
         (u'Gross', 4875.0, u'01-2017')]

    d = defaultdict(int)
    for t in l:
        d[(t[0], t[-1])] += t[1]

    # create list of tuples from the defaultdict values
    result = [(k[0], d[k], k[1]) for k in d]

    >>> print(result)
    [(u'basic', 15383, '01-2017'), (u'Gross', 9750.0, u'01-2017')]
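
The same accumulation can be wrapped in a small reusable function. This is a minimal sketch, assuming Python 3.7+ (where dictionaries keep insertion order, so each group appears in the order its key was first seen). The name `sum_by_first_and_last` is hypothetical and not part of the answer above:

```python
from collections import defaultdict


def sum_by_first_and_last(rows):
    """Sum the middle element of 3-tuples that share the same first and last element."""
    totals = defaultdict(int)
    for first, value, last in rows:
        totals[(first, last)] += value
    # Rebuild tuples in the original (first, total, last) layout.
    return [(first, total, last) for (first, last), total in totals.items()]


p = [('basic', 7698, '01-2017'),
     ('basic', 7685, '01-2017'),
     ('Gross', 4875.0, '01-2017'),
     ('Gross', 4875.0, '01-2017')]

result = sum_by_first_and_last(p)
print(result)
# [('basic', 15383, '01-2017'), ('Gross', 9750.0, '01-2017')]
```

Unpacking each tuple into `first, value, last` assumes every row has exactly three elements; rows of another length would raise a `ValueError`.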

Answer 1 (score: 0)

    from collections import defaultdict

    l = [(u'basic', 7698, '01-2017'),
         (u'basic', 7685, '01-2017'),
         (u'Gross', 4875.0, u'01-2017'),
         (u'Gross', 4875.0, u'01-2017'),
         (u'basic', 7685, '01-2017')]

    # build the (first, last) key for each tuple
    r = [(x, z) for x, y, z in l]

    # pair each key with its tuple's second element, sorted by key; this is based on
    # https://stackoverflow.com/questions/6618515/sorting-list-based-on-values-from-another-list
    r_sorted = [(key, t[1]) for (key, t) in sorted(zip(r, l), key=lambda pair: pair[0])]

    # sum the values per key; this is based on
    # https://stackoverflow.com/questions/18194712/how-do-i-sum-tuples-in-a-list-where-the-first-value-is-the-same
    # as per @Idles duplication alert
    testDict = defaultdict(int)
    for key, val in r_sorted:
        testDict[key] += val

    print(testDict.items())
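
This prints the totals keyed by `(first, last)` pairs rather than the three-element tuples the question asks for. One possible way to convert such a mapping back is sketched here; the helper name `totals_to_tuples` is hypothetical, and the `totals` dict stands in for `testDict` with the sums of the five tuples above:

```python
def totals_to_tuples(totals):
    """Turn {(first, last): total} into a list of (first, total, last) tuples."""
    return [(first, total, last) for (first, last), total in totals.items()]


# Stand-in for testDict, using the sums of the five tuples above.
totals = {('Gross', '01-2017'): 9750.0, ('basic', '01-2017'): 23068}

rows = totals_to_tuples(totals)
print(rows)
```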

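Answer 2 (score: 0)

You can also use itertools.groupby. It groups only consecutive items, so the list must already be ordered by the key, as it is here: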

    from itertools import groupby

    p = [(u'basic', 7698, '01-2017'),
         (u'basic', 7685, '01-2017'),
         (u'Gross', 4875.0, u'01-2017'),
         (u'Gross', 4875.0, u'01-2017')]

    [(grp[0], sum(val[1] for val in vals), grp[1])
        for grp, vals in groupby(p, key=lambda x: (x[0], x[2]))]

    # [('basic', 15383, '01-2017'), ('Gross', 9750.0, '01-2017')]
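
If the tuples arrive in arbitrary order, sorting by the same key before grouping handles it. This is a minimal sketch, assuming it is acceptable for the result to come out in sorted key order; `rows` and `key` are illustrative names, not part of the answer above:

```python
from itertools import groupby

# Same data as above, deliberately interleaved.
rows = [('basic', 7698, '01-2017'),
        ('Gross', 4875.0, '01-2017'),
        ('basic', 7685, '01-2017'),
        ('Gross', 4875.0, '01-2017')]


def key(row):
    """Group on the first and last element of each tuple."""
    return (row[0], row[2])


# Sort first so that equal keys become adjacent, then sum each run.
result = [(first, sum(row[1] for row in group), last)
          for (first, last), group in groupby(sorted(rows, key=key), key=key)]
print(result)
# [('Gross', 9750.0, '01-2017'), ('basic', 15383, '01-2017')]
```

Because uppercase letters sort before lowercase ones, `'Gross'` comes before `'basic'` here. If first-seen order matters, a dictionary-based accumulation is likely the simpler choice.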