Question

I have multiple lists, as shown below:

[u'a', 11, u'P']
[u'a', 11, u'A']
[u'b', 2, u'P']
[u'c', 1, u'P']
[u'c', 2, u'P']
[u'd', 1, u'P']
[u'e', 3, u'P']
[u'f', 2, u'P']
[u'a', 1, u'P']
[u'a', 2, u'P']
[u'b', 1, u'P']
[u'b', 11, u'P']

How can I merge the lists above, looping over them and adding them up as follows?

[u'a', 11, u'P'] + [u'a', 2, u'P'] + [u'a', 11, u'A'] = ['a',('P' : 13) ,('A': 11)]

[u'b', 2, u'P'] + [u'b', 1, u'P'] + [u'b', 11, u'P'] = ['b',14,p]

The output should look like this:

['a',('P' : 13) ,('A': 11)]
['b',14,'p']

Answer 1

You could consider using collections.defaultdict and then iterating over the list of dicts stored under each key.

import collections
d = collections.defaultdict(list)
l = [[u'a', 11, u'P'],[u'a', 11, u'A'],[u'a', 3, u'P'],[u'b', 2, u'P'],[u'c', 1, u'P'],[u'c', 2, u'P'],[u'd', 1, u'P'],[u'e', 3, u'P']]
for k1, v, k2 in l:
    # defaultdict(list) creates the empty list on first access,
    # so no membership check is needed
    d[k1].append({k2: v})

newdict = {}
for key,value in d.items():
    newvalue = {}
    for valuedict in value:
        for key2,value2 in valuedict.items():
            if key2 in newvalue:
                newvalue[key2] += value2
            else:
                newvalue[key2] = value2
    newdict[key] = newvalue

print newdict

This gives you:

{u'a': {u'A': 11, u'P': 14}, u'c': {u'P': 3}, u'b': {u'P': 2}, u'e': {u'P': 3}, u'd': {u'P': 1}}
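
The two passes above can also be folded into a single loop. What follows is a minimal sketch rather than part of the original answer: it assumes a nested defaultdict whose inner values default to 0, the name totals is chosen here for illustration, and it is written for Python 3 (so the u'' prefixes are dropped).

```python
from collections import defaultdict

# Same sample data as in the answer above (u'' prefixes dropped for Python 3).
l = [['a', 11, 'P'], ['a', 11, 'A'], ['a', 3, 'P'], ['b', 2, 'P'],
     ['c', 1, 'P'], ['c', 2, 'P'], ['d', 1, 'P'], ['e', 3, 'P']]

# Outer dict: first element -> inner dict.
# Inner dict: third element -> running sum of the second element.
totals = defaultdict(lambda: defaultdict(int))
for k1, v, k2 in l:
    totals[k1][k2] += v

# Convert to plain dicts so the result prints and compares like a normal dict.
newdict = {k1: dict(inner) for k1, inner in totals.items()}
print(newdict)
```

The result holds the same sums as the two-pass version; only the intermediate list of single-item dicts is skipped.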

Answer 2

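Your desired output looks a bit odd because the two cases are inconsistent with each other. However, you can get the output you want with a simple change to this example: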

lists = [
 [u'a', 11, u'P'],
 [u'a', 11, u'A'],
 [u'b', 2, u'P'],
 [u'c', 1, u'P'],
 [u'c', 2, u'P'],
 [u'd', 1, u'P'],
 [u'e', 3, u'P'],
 [u'f', 2, u'P'],
 [u'a', 1, u'P'],
 [u'a', 2, u'P'],
 [u'b', 1, u'P'],
 [u'b', 11, u'P']]

# Each key in this dictionary will be one of the first elements
# from the lists shown above.  The values will be dictionaries
# mapping a letter (one of the third elements in each list) to
# their total count (i.e. the sum of the second elements matching
# the other two columns)
from collections import defaultdict
results = defaultdict(dict)

for main_key, count, subkey in lists:
    d = results[main_key]
    d[subkey] = d.get(subkey,0) + count

for main_key, values in results.items():
    print main_key, "=>", values

The output is:

a => {u'A': 11, u'P': 14}
c => {u'P': 3}
b => {u'P': 14}
e => {u'P': 3}
d => {u'P': 1}
f => {u'P': 2}

Update: Thanks to sharjeel for suggesting in the comments below that I replace setdefault with defaultdict.
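
For comparison, here is a hedged sketch of the same loop written with a plain dict and setdefault, next to the defaultdict form. The earlier setdefault version of this answer is not shown in the thread, so the function names sum_with_setdefault and sum_with_defaultdict and the shortened sample data are assumptions made for illustration (Python 3).

```python
from collections import defaultdict

# Shortened, hypothetical sample in the same [key, count, subkey] shape.
sample = [['a', 11, 'P'], ['a', 11, 'A'], ['b', 2, 'P'], ['a', 1, 'P'], ['b', 1, 'P']]

def sum_with_setdefault(rows):
    # Plain dict: setdefault inserts an empty inner dict the first time a key is seen.
    results = {}
    for main_key, count, subkey in rows:
        d = results.setdefault(main_key, {})
        d[subkey] = d.get(subkey, 0) + count
    return results

def sum_with_defaultdict(rows):
    # defaultdict(dict): the empty inner dict is created automatically on first access.
    results = defaultdict(dict)
    for main_key, count, subkey in rows:
        d = results[main_key]
        d[subkey] = d.get(subkey, 0) + count
    return dict(results)

# Both approaches produce the same nested totals.
print(sum_with_setdefault(sample) == sum_with_defaultdict(sample))
```

Both forms compute the same totals; defaultdict just moves the "create if missing" step out of the loop body.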

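Update 2: In your follow-up question in the comments below, you said that you wanted the output to be "[a] set of lists like [[u'a', 11, u'P'], [u'a', 11, u'A']". (I'm assuming for now that you mean a list rather than a set, but a set is almost as easy.) To build such a list of lists, you can replace the loop that prints the values with the following code: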

lists_output = []

for main_key, values in results.items():
    for subkey, count in values.items():
        lists_output.append([main_key, count, subkey])

print lists_output

... which will produce the output:

[[u'a', 11, u'A'], [u'a', 14, u'P'], [u'c', 3, u'P'], [u'b', 14, u'P'], [u'e', 3, u'P'],
 [u'd', 1, u'P'], [u'f', 2, u'P']]
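
If a real set is wanted after all, or a predictable ordering, the following sketch shows one way to get both. It is not from the original answer: it assumes the results dictionary has already been computed (it is written out literally here so the block runs on its own), and it assumes Python 3. Because lists are not hashable, the set holds tuples instead.

```python
# Totals as computed above, written out literally so this block is self-contained.
results = {'a': {'A': 11, 'P': 14}, 'c': {'P': 3}, 'b': {'P': 14},
           'e': {'P': 3}, 'd': {'P': 1}, 'f': {'P': 2}}

# List of lists, sorted by (first element, third element) for a stable order.
lists_output = sorted(
    ([main_key, count, subkey]
     for main_key, values in results.items()
     for subkey, count in values.items()),
    key=lambda row: (row[0], row[2]),
)

# Set variant: rows must be tuples, since lists cannot be set members.
set_output = set(tuple(row) for row in lists_output)

print(lists_output)
```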

Answer 3

If you use groupby from itertools, this becomes a one-line solution.

Put all of your lists into a single list, say lst.

lst = [
    [u'a', 11, u'P'],
    [u'a', 11, u'A'],
    [u'b', 2, u'P'],
    [u'c', 1, u'P'],
    [u'c', 2, u'P'],
    [u'd', 1, u'P'],
    [u'e', 3, u'P'],
    [u'f', 2, u'P'],
    [u'a', 1, u'P'],
    [u'a', 2, u'P'],
    [u'b', 1, u'P'],
    [u'b', 11, u'P'],
]

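Now use groupby on the first element to group the rows by a, b, c, and so on. Then, within each of those groups, group again on the third element (P, A, etc.) and sum the counts in each sub-group.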

以下是解决方案：

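from itertools import groupby

# groupby only merges adjacent rows, so sort by (first, third) element first
ordered = sorted(lst, key=lambda x: (x[0], x[2]))
result = dict(
    (k, dict((k1, sum(i[1] for i in g2)) for k1, g2 in groupby(g, key=lambda y: y[2])))
    for k, g in groupby(ordered, key=lambda x: x[0])
)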

To understand it better, I suggest experimenting with a single groupby first and then moving on to the nested one.
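
As a starting point for that experiment, here is a small single-level sketch. It is an illustration added here, not code from the answer; the names rows, unsorted_keys, ordered and sums are hypothetical, and it assumes Python 3. It shows that groupby only merges consecutive rows with the same key, so the input has to be sorted by that key before grouping.

```python
from itertools import groupby

# Small hypothetical sample: 'a' appears in two non-adjacent rows.
rows = [['a', 11, 'P'], ['b', 2, 'P'], ['a', 1, 'P']]

# Unsorted input: 'a' forms two separate runs, so it yields two groups.
unsorted_keys = [k for k, g in groupby(rows, key=lambda x: x[0])]

# Sorted by the grouping key: each key now yields exactly one group.
ordered = sorted(rows, key=lambda x: x[0])
sums = dict((k, sum(i[1] for i in g)) for k, g in groupby(ordered, key=lambda x: x[0]))

print(unsorted_keys)  # ['a', 'b', 'a']
print(sums)           # {'a': 12, 'b': 2}
```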

Here are some links:

http://docs.python.org/library/itertools.html#itertools.groupby

http://www.builderau.com.au/program/python/soa/Python-groupby-the-iterator-swiss-army-knife/0,2000064084,339280431,00.htm

How to merge multiple lists in Python
