Sorting and combining data in a list of dictionaries in Python

Time: 2012-12-13 17:14:04

Tags: python list sorting dictionary

I have a list of dictionaries similar to this:

{'Catch': 4.414, 'ShipID': 173, 'Name': u'Sigur\xf0ur \xd3lafsson SF - 44', 'Gear': u'BOTN'}
{'Catch': 2.401, 'ShipID': 173, 'Name': u'Sigur\xf0ur \xd3lafsson SF - 44', 'Gear': u'BOTN'}
{'Catch': 67.463, 'ShipID': 1275, 'Name': u'J\xf3n V\xeddal\xedn VE - 82', 'Gear': u'BOTN'}
{'Catch': 51.803, 'ShipID': 1275, 'Name': u'J\xf3n V\xeddal\xedn VE - 82', 'Gear': u'BOTN'}
{'Catch': 7.539, 'ShipID': 1595, 'Name': u'Fr\xe1r VE - 78', 'Gear': u'BOTN'}
{'Catch': 97.984, 'ShipID': 1903, 'Name': u'\xdeorsteinn \xdeH - 360', 'Gear': u'BOTN'}
{'Catch': 94.796, 'ShipID': 1903, 'Name': u'\xdeorsteinn \xdeH - 360', 'Gear': u'BOTN'}
{'Catch': 61.347, 'ShipID': 2020, 'Name': u'Su\xf0urey VE - 12', 'Gear': u'BOTN'}
{'Catch': 21.135, 'ShipID': 2401, 'Name': u'\xde\xf3runn Sveinsd\xf3ttir VE - 401', 'Gear': u'BOTN'}
{'Catch': 16.151, 'ShipID': 2444, 'Name': u'Vestmannaey VE - 444', 'Gear': u'BOTN'}
{'Catch': 41.213, 'ShipID': 2677, 'Name': u'Bergur VE - 44', 'Gear': u'BOTN'}
{'Catch': 5.046, 'ShipID': 2403, 'Name': u'Hvanney SF - 51', 'Gear': u'NET'}
{'Catch': 2.311, 'ShipID': 2403, 'Name': u'Hvanney SF - 51', 'Gear': u'NET'}
{'Catch': 6.304, 'ShipID': 2403, 'Name': u'Hvanney SF - 51', 'Gear': u'NET'}
{'Catch': 4.231, 'ShipID': 2732, 'Name': u'Skinney SF - 20', 'Gear': u'NET'}
{'Catch': 6.46, 'ShipID': 2732, 'Name': u'Skinney SF - 20', 'Gear': u'NET'}
...

This list has already been sorted with:

list_sorted = sorted(landingList, key=lambda d:(d['Gear'], d['ShipID']))

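This is a list of landings by Icelandic vessels. What I want to do is split the list by 'Gear' and add up the catches so that each ship appears once with its total catch. Something like this.

The 'BOTN' list would look like this: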

{'Catch': 6.815, 'ShipID': 173, 'Name': u'Sigur\xf0ur \xd3lafsson SF - 44', 'Gear': u'BOTN'}
{'Catch': 119.266, 'ShipID': 1275, 'Name': u'J\xf3n V\xeddal\xedn VE - 82', 'Gear': u'BOTN'}
{'Catch': 7.539, 'ShipID': 1595, 'Name': u'Fr\xe1r VE - 78', 'Gear': u'BOTN'}
{'Catch': 192.78, 'ShipID': 1903, 'Name': u'\xdeorsteinn \xdeH - 360', 'Gear': u'BOTN'}
{'Catch': 61.347, 'ShipID': 2020, 'Name': u'Su\xf0urey VE - 12', 'Gear': u'BOTN'}
{'Catch': 21.135, 'ShipID': 2401, 'Name': u'\xde\xf3runn Sveinsd\xf3ttir VE - 401', 'Gear': u'BOTN'}
{'Catch': 16.151, 'ShipID': 2444, 'Name': u'Vestmannaey VE - 444', 'Gear': u'BOTN'}
{'Catch': 41.213, 'ShipID': 2677, 'Name': u'Bergur VE - 44', 'Gear': u'BOTN'}

And the 'NET' list would look like this:

{'Catch': 13.661, 'ShipID': 2403, 'Name': u'Hvanney SF - 51', 'Gear': u'NET'}
{'Catch': 10.691, 'ShipID': 2732, 'Name': u'Skinney SF - 20', 'Gear': u'NET'}

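Of course there are more gear types and landings, but this is just to demonstrate what the list looks like and what I am trying to do. Can you help me solve this?

2 answers:

Answer 0 (score: 2)

You can use the itertools.groupby() function: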

from itertools import groupby
from operator import itemgetter

for gear, group in groupby(list_sorted, key=itemgetter('Gear')):
    # group is an iterator; loop over it to get all items with the same value for Gear.
    # gear is the value of this group's "Gear" key.
    items = list(group)  # materialize the group before the loop moves on
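
To make the grouping idea concrete, here is a minimal sketch (written for Python 3) that nests two groupby() calls, first on 'Gear' and then on 'ShipID', and sums 'Catch' within each ship. The function name total_by_gear, the variable sample, and the rounding to three decimals are assumptions made for illustration; they do not come from the question.

```python
from itertools import groupby
from operator import itemgetter


def total_by_gear(landings):
    """Return {gear: [one dict per ship, with 'Catch' summed]}."""
    # groupby() only merges adjacent items, so sort on the grouping keys first.
    ordered = sorted(landings, key=itemgetter('Gear', 'ShipID'))
    result = {}
    for gear, gear_group in groupby(ordered, key=itemgetter('Gear')):
        ships = []
        for ship_id, ship_group in groupby(gear_group, key=itemgetter('ShipID')):
            rows = list(ship_group)
            combined = dict(rows[0])  # copy, so the input rows stay untouched
            # Rounding to 3 decimals is an assumption based on the sample data.
            combined['Catch'] = round(sum(r['Catch'] for r in rows), 3)
            ships.append(combined)
        result[gear] = ships
    return result


# A few rows taken from the question's sample data.
sample = [
    {'Catch': 4.414, 'ShipID': 173, 'Name': 'Sigurður Ólafsson SF - 44', 'Gear': 'BOTN'},
    {'Catch': 2.401, 'ShipID': 173, 'Name': 'Sigurður Ólafsson SF - 44', 'Gear': 'BOTN'},
    {'Catch': 5.046, 'ShipID': 2403, 'Name': 'Hvanney SF - 51', 'Gear': 'NET'},
    {'Catch': 2.311, 'ShipID': 2403, 'Name': 'Hvanney SF - 51', 'Gear': 'NET'},
    {'Catch': 6.304, 'ShipID': 2403, 'Name': 'Hvanney SF - 51', 'Gear': 'NET'},
    {'Catch': 4.231, 'ShipID': 2732, 'Name': 'Skinney SF - 20', 'Gear': 'NET'},
    {'Catch': 6.46, 'ShipID': 2732, 'Name': 'Skinney SF - 20', 'Gear': 'NET'},
]

totals = total_by_gear(sample)
for gear, ships in totals.items():
    for ship in ships:
        print(gear, ship['Name'], ship['Catch'])
```

Each inner group is turned into a list before the outer loop advances, because groupby() shares one underlying iterator and a group becomes unusable once the next one is requested.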

Answer 1 (score: 0)

@Martijn suggested using itertools.groupby, but the implementation is not straightforward. Here is how I was able to achieve it:

from itertools import groupby, tee
from operator import itemgetter
from copy import deepcopy
# Wrap the entire functionality in a function named combine
def combine(data):

    # Nested generator that calculates the subtotal
    # for each Gear and Name pair
    def subtotal(data):
        # Iterate through the items grouped by Gear
        for k1, v1 in data.items():
            # Save the current gear
            gear = k1
            # For each group of catches sharing a name
            for k2, v2 in v1:
                # Save the name
                name = k2
                # Create two iterators from one.
                # We need this because we don't want
                # to lose the iterator after we
                # make an entire pass to calculate
                # the total catch
                v2_1, v2_2 = tee(v2)
                # Calculate the total catch for this name
                tot_catch = sum(e['Catch'] for e in v2_1)
                # Select the first element from the group
                v2_2 = next(v2_2)
                # Update its catch with the total catch
                # and discard the rest of the group
                v2_2['Catch'] = tot_catch
                # Finally, yield a tuple of the subtotal row
                # along with the gear and the name
                yield (gear, name, v2_2)
    # Create a copy of the data, as we will be changing it in place
    data = deepcopy(data)
    # First group by the gear
    data_groupby_Gear = groupby(data, key=itemgetter('Gear'))
    # Within each gear, group by the name
    data_groupby_Gear_then_Name = {k: groupby(list(v), itemgetter('Name'))
                                   for k, v in data_groupby_Gear}
    # Return the combined result
    return subtotal(data_groupby_Gear_then_Name)

# Just iterate through the combined data and print
# in any way you desire. The input must already be sorted
# by Gear, as list_sorted in the question is.
for gear, name, sumtotal in combine(list_sorted):
    print(u"{:10} {:20} {}".format(gear, name, sumtotal))

The final output looks like this, but the display can be reformatted as needed:

NET        Hvanney SF - 51      {'Catch': 13.661000000000001, 'ShipID': 2403, 'Name': u'Hvanney SF - 51', 'Gear': u'NET'}
NET        Skinney SF - 20      {'Catch': 10.690999999999999, 'ShipID': 2732, 'Name': u'Skinney SF - 20', 'Gear': u'NET'}
BOTN       Sigurður Ólafsson SF - 44 {'Catch': 6.8149999999999995, 'ShipID': 173, 'Name': u'Sigur\xf0ur \xd3lafsson SF - 44', 'Gear': u'BOTN'}
BOTN       Jón Vídalín VE - 82  {'Catch': 119.26599999999999, 'ShipID': 1275, 'Name': u'J\xf3n V\xeddal\xedn VE - 82', 'Gear': u'BOTN'}
BOTN       Frár VE - 78         {'Catch': 7.539, 'ShipID': 1595, 'Name': u'Fr\xe1r VE - 78', 'Gear': u'BOTN'}
BOTN       Þorsteinn ÞH - 360   {'Catch': 192.78, 'ShipID': 1903, 'Name': u'\xdeorsteinn \xdeH - 360', 'Gear': u'BOTN'}
BOTN       Suðurey VE - 12      {'Catch': 61.347, 'ShipID': 2020, 'Name': u'Su\xf0urey VE - 12', 'Gear': u'BOTN'}
BOTN       Þórunn Sveinsdóttir VE - 401 {'Catch': 21.135, 'ShipID': 2401, 'Name': u'\xde\xf3runn Sveinsd\xf3ttir VE - 401', 'Gear': u'BOTN'}
BOTN       Vestmannaey VE - 444 {'Catch': 16.151, 'ShipID': 2444, 'Name': u'Vestmannaey VE - 444', 'Gear': u'BOTN'}
BOTN       Bergur VE - 44       {'Catch': 41.213, 'ShipID': 2677, 'Name': u'Bergur VE - 44', 'Gear': u'BOTN'}
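
The totals above carry floating-point artifacts such as 13.661000000000001, because binary floats cannot represent most decimal fractions exactly. As a rough sketch of one way to tidy the display (written for Python 3), the summed value can be formatted with a fixed number of decimals. The helper name format_row and the choice of three decimals are assumptions for illustration, not part of either answer.

```python
# Hypothetical helper: format one summary row with the total shown to 3 decimals.
def format_row(gear, name, catches):
    total = sum(catches)
    # {:.3f} rounds only for display; the stored float is unchanged.
    return "{:10} {:20} {:.3f}".format(gear, name, total)


# Catch values for the two NET ships, taken from the question's sample data.
rows = [
    ('NET', 'Hvanney SF - 51', [5.046, 2.311, 6.304]),
    ('NET', 'Skinney SF - 20', [4.231, 6.46]),
]

lines = [format_row(*row) for row in rows]
for line in lines:
    print(line)
```

If exact decimal arithmetic matters more than display, the standard library's decimal module is another option, at the cost of converting the input values first.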