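Merging two arrays by sets of two elements

Time: 2013-01-03 18:00:36

Tags: python arrays

I have an array containing an even number of integers. The array represents pairs of an identifier and a count. The tuples are already sorted by identifier. I would like to merge a few of these arrays together. I have thought of a few ways to do it, but they are fairly complicated, and I feel there might be a simple way to do this in Python.

I.e.: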

[<id>, <count>, <id>, <count>]

Input:

[14, 1, 16, 4, 153, 21]
[14, 2, 16, 3, 18, 9]

Output:

[14, 3, 16, 7, 18, 9, 153, 21]

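4 answers:

Answer 0 (score: 8)

It is better to store these as dictionaries rather than lists (not just for this purpose, but for other use cases too, such as extracting the value for a single ID):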

x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

# turn into dictionaries (could write a function to convert)
d1 = dict([(x1[i], x1[i + 1]) for i in range(0, len(x1), 2)])
d2 = dict([(x2[i], x2[i + 1]) for i in range(0, len(x2), 2)])

print(d1)
# {14: 1, 16: 4, 153: 21}  (Python 3.7+ keeps insertion order)

After that, you can use any of the solutions in this question to add them together. For example (taken from the first answer):

import collections

def d_sum(a, b):
    d = collections.defaultdict(int, a)
    for k, v in b.items():
        d[k] += v
    return dict(d)

print(d_sum(d1, d2))
# {14: 3, 16: 7, 153: 21, 18: 9}  (Python 3.7+ keeps insertion order)
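
The question asks for the result in the same flat [id, count, ...] layout, sorted by id. A minimal sketch of that round trip follows, assuming Python 3. The helper names pairs_to_dict and dict_to_pairs are made up for illustration and are not part of the answer above.

```python
def pairs_to_dict(flat):
    """Turn [id, count, id, count, ...] into {id: count}."""
    # flat[0::2] holds the ids, flat[1::2] holds the matching counts
    return dict(zip(flat[0::2], flat[1::2]))


def dict_to_pairs(d):
    """Flatten {id: count} back into [id, count, ...], sorted by id."""
    flat = []
    for key in sorted(d):
        flat.extend((key, d[key]))
    return flat


x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

merged = pairs_to_dict(x1)
for key, value in pairs_to_dict(x2).items():
    # Add to the existing count, or start from 0 for a new id
    merged[key] = merged.get(key, 0) + value

print(dict_to_pairs(merged))
# [14, 3, 16, 7, 18, 9, 153, 21]
```

Sorting the keys at the end restores the ordering by identifier that the input lists had.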

Answer 1 (score: 5)

Use collections.Counter:

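import itertools
import collections

# sample input from the question
lst1 = [14, 1, 16, 4, 153, 21]
lst2 = [14, 2, 16, 3, 18, 9]

def grouper(n, iterable, fillvalue=None):
    # collect the items into fixed-length chunks of n
    args = [iter(iterable)] * n
    # Python 3 name; on Python 2 this was itertools.izip_longest
    return itertools.zip_longest(*args, fillvalue=fillvalue)

count1 = collections.Counter(dict(grouper(2, lst1)))
count2 = collections.Counter(dict(grouper(2, lst2)))
result = count1 + count2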

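I used the grouper recipe from the itertools library documentation here to convert your data into dictionaries, but as the other answers have shown you, there are more ways to skin that particular cat.

result is a Counter with each id pointing to a total count: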

Counter({153: 21, 18: 9, 16: 7, 14: 3})

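Counters are multisets and make it easy to keep track of a count per key. That feels like a much better data structure for your data. They support summing, as used above, for example.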
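
Since the question mentions merging several of these arrays, here is a short sketch (Python 3) of summing any number of Counters at once. The helper name to_counter and the third input list are made up for illustration. One behavior worth knowing is that Counter addition keeps only keys whose resulting count is positive.

```python
from collections import Counter


def to_counter(flat):
    """Build a Counter from a flat [id, count, id, count, ...] list."""
    return Counter(dict(zip(flat[0::2], flat[1::2])))


lists = [
    [14, 1, 16, 4, 153, 21],
    [14, 2, 16, 3, 18, 9],
    [16, 1, 200, 5],  # extra, made-up input for illustration
]

# sum() needs an empty Counter as the start value; the default start of 0
# cannot be added to a Counter
total = sum((to_counter(lst) for lst in lists), Counter())

print(sorted(total.items()))
# [(14, 3), (16, 8), (18, 9), (153, 21), (200, 5)]

# Counter addition drops entries whose summed count is zero or negative
print(Counter({1: 0}) + Counter({2: 3}))
# Counter({2: 3})
```

If zero counts have to survive the merge, a plain dict or defaultdict(int) loop may be the safer choice.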

Answer 2 (score: 5)

You need collections.Counter():

In [21]: lis1=[14, 1, 16, 4, 153, 21]

In [22]: lis2=[14, 2, 16, 3, 18, 9]

In [23]: from collections import Counter

In [24]: dic1=Counter(dict(zip(lis1[0::2],lis1[1::2])))

In [25]: dic2=Counter(dict(zip(lis2[0::2],lis2[1::2])))

In [26]: dic1+dic2
Out[26]: Counter({153: 21, 18: 9, 16: 7, 14: 3})

Or:

In [51]: it1=iter(lis1)

In [52]: it2=iter(lis2)

In [53]: dic1=Counter(dict((next(it1),next(it1)) for _ in range(len(lis1)//2)))

In [54]: dic2=Counter(dict((next(it2),next(it2)) for _ in range(len(lis2)//2)))

In [55]: dic1+dic2
Out[55]: Counter({153: 21, 18: 9, 16: 7, 14: 3})

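Answer 3 (score: 0)

All of the previous answers look good, but I think the JSON blob should be properly formed to begin with, or else (in my experience) it can cause some serious problems during debugging and so on. In this case, with id and count as the fields, the JSON should look like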

[{"id":1, "count":10}, {"id":2, "count":10}, {"id":1, "count":5}, ...]

Properly formed JSON like this is much easier to deal with, and is probably similar to what you have coming in.
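
As a rough sketch of what that reshaping could look like for the flat arrays in the question (Python 3), each pair becomes one record, and the combined list is sorted by id so that records with equal ids sit next to each other. The helper name to_records is hypothetical and not from this answer.

```python
import json
from operator import itemgetter


def to_records(flat):
    """Turn [id, count, id, count, ...] into [{"id": ..., "count": ...}, ...]."""
    return [
        {"id": ident, "count": count}
        for ident, count in zip(flat[0::2], flat[1::2])
    ]


x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

# Concatenate both inputs, then sort by id; sorted() is stable, so records
# with the same id keep their original relative order
records = sorted(to_records(x1) + to_records(x2), key=itemgetter("id"))

print(json.dumps(records[:3]))
# [{"id": 14, "count": 1}, {"id": 14, "count": 2}, {"id": 16, "count": 4}]
```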

This class is a bit general, but certainly extensible:

from itertools import groupby


class ListOfDicts():
    def __init__(self, listofD=None):
        self.list = []
        if listofD is not None:
            self.list = listofD

    def key_total(self, group_by_key, aggregate_key):
        """ Aggregate a list of dicts by a specific key, and aggregation key"""
        out_dict = {}
        # groupby only groups consecutive records, so self.list must already
        # be sorted by group_by_key
        for k, g in groupby(self.list, key=lambda r: r[group_by_key]):
            print(k)
            total = 0
            for record in g:
                print("   ", record)
                total += record[aggregate_key]
            out_dict[k] = total
        return out_dict


if __name__ == "__main__":
    z = ListOfDicts([ {'id':1, 'count':2, 'junk':2}, 
                   {'id':1, 'count':4, 'junk':2},
                   {'id':1, 'count':6, 'junk':2},
                   {'id':2, 'count':2, 'junk':2}, 
                   {'id':2, 'count':3, 'junk':2},
                   {'id':2, 'count':3, 'junk':2},
                   {'id':3, 'count':10, 'junk':2},
                   ])

    totals = z.key_total("id", "count")
    print(totals)

Which gives, after the debug lines printed for each group, the totals per id:

{1: 12, 2: 8, 3: 10}