How to reduce/aggregate a list of dicts by multiple keys in Python?

Asked: 2014-06-18 13:31:50

Tags: python dictionary

I have a list of dictionaries like this:

sales_per_store_per_day = [
   {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
   {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
   {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
   {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]

How can I reduce this list to get the product totals per store, ignoring the date? The result for the input above would be:

sales_per_store = [
   {'store':'a', 'product1':40, 'product2':8, 'product3':32},
   {'store':'b', 'product1':60, 'product2':10, 'product3':34}
]

2 answers:

Answer 0 (score: 6)

Use collections.defaultdict() to track the information per store, and collections.Counter() to simplify summing the numbers:

from collections import defaultdict, Counter

by_store = defaultdict(Counter)

for info in sales_per_store_per_day:
    counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
    by_store[info['store']] += counts

sales_per_store = [dict(v, store=k) for k, v in by_store.items()]

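counts is a Counter() instance built from the products in the info dictionary; I'm assuming that everything except the store and date keys is a product count. A dict comprehension produces a copy with those two keys removed. by_store[info['store']] looks up the running totals for the given store (defaulting to a new, empty Counter() object).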
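
To make the two building blocks concrete, here is a small sketch of how a defaultdict(Counter) hands out an empty Counter on first lookup and how += then adds counts key by key. The store name and numbers are made up for illustration.

```python
from collections import defaultdict, Counter

by_store = defaultdict(Counter)

# Looking up a missing key creates and stores an empty Counter for it.
was_empty = (by_store['a'] == Counter())

# += adds counts key by key; a key missing on one side counts as 0.
by_store['a'] += Counter({'product1': 10, 'product2': 3})
by_store['a'] += Counter({'product1': 30, 'product3': 17})

totals = dict(by_store['a'])
print(was_empty)
print(totals)
```

Because the default is created on lookup, the loop never needs an explicit "is this store already known?" check.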

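The last line then produces the output you asked for: new dictionaries with 'store' and the per-product counts. You may prefer to simply keep the dictionary that maps each store to its Counter object.

Demo: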

>>> from collections import defaultdict, Counter
>>> sales_per_store_per_day = [
...    {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
...    {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
...    {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
...    {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
... ]
>>> by_store = defaultdict(Counter)
>>> for info in sales_per_store_per_day:
...     counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
...     by_store[info['store']] += counts
... 
>>> [dict(v, store=k) for k, v in by_store.items()]
[{'store': 'a', 'product3': 32, 'product2': 8, 'product1': 40}, {'store': 'b', 'product3': 34, 'product2': 10, 'product1': 60}]
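
One caveat worth knowing: Counter addition (+ and +=) keeps only positive counts, so a product whose running total is zero or negative silently disappears from the result. If that matters for your data, a sketch of the same loop using Counter.update(), which adds counts without discarding anything, could look like this. The rows below are invented for illustration and are not from the question.

```python
from collections import defaultdict, Counter

# Hypothetical data: store 'a' sold nothing of product2 on either day.
rows = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 0},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 0},
]

with_iadd = defaultdict(Counter)
with_update = defaultdict(Counter)

for info in rows:
    counts = {k: v for k, v in info.items() if k not in ('store', 'date')}
    with_iadd[info['store']] += Counter(counts)   # discards totals <= 0
    with_update[info['store']].update(counts)     # keeps every key

print(dict(with_iadd['a']))    # product2 is missing here
print(dict(with_update['a']))  # product2 is kept with a total of 0
```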

Answer 1 (score: 3)

A version without collections, which may be more readable for beginners.

sales_per_store_per_day = [
   {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
   {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
   {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
   {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]

results = {}

for x in sales_per_store_per_day:

    # default value
    if x['store'] not in results:
        results[x['store']] = {'store': x['store'], 'product1':0, 'product2':0, 'product3':0}

    results[x['store']]['product1'] += x['product1']
    results[x['store']]['product2'] += x['product2']
    results[x['store']]['product3'] += x['product3']

print(results)

sales_per_store = list(results.values())

print(sales_per_store)

# results
{
  'a': {'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8}, 
  'b': {'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
}

# sales_per_store
[
  {'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8}, 
  {'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
]
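
The loop above names product1 through product3 explicitly. As a sketch of a variant that still avoids collections but copes with any number of product keys, assuming (as the first answer does) that every key other than 'store' and 'date' holds a numeric product count:

```python
sales_per_store_per_day = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 3, 'product3': 15},
    {'date': '2014-06-01', 'store': 'b', 'product1': 20, 'product2': 4, 'product3': 16},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 5, 'product3': 17},
    {'date': '2014-06-02', 'store': 'b', 'product1': 40, 'product2': 6, 'product3': 18},
]

results = {}

for row in sales_per_store_per_day:
    # Create the per-store entry the first time a store is seen.
    totals = results.setdefault(row['store'], {'store': row['store']})
    for key, value in row.items():
        if key in ('store', 'date'):
            continue  # not a product count
        totals[key] = totals.get(key, 0) + value

sales_per_store = list(results.values())
print(sales_per_store)
```

Unlike Counter addition, this keeps products whose total is zero.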