Computing averages across a dictionary of nested dictionaries

Time: 2019-08-01 14:37:04

Tags: python dictionary

I have a dictionary with the following structure:

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

I would like the output to look like this:

b = {'average': {'salary': {'year1': 43.3, 'year2': 58.3}, 'age': 24}}

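So the values in each inner dict are either numbers or dictionaries. When a value is a dictionary, every constituent dictionary is guaranteed to have the same keys (i.e. the same years always appear under salary for every actor).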

I have no trouble finding the correct value for the age key, which can be done as follows:

import numpy as np

actor_keys = list(d)
b = {}
b['average'] = {}
# Read the ages from d (the input), not from b (the still-empty result).
b['average']['age'] = np.mean([d[i]['age'] for i in actor_keys])

Is there a similar way to do this calculation for the keys inside salary?

5 answers:

Answer 0 (score: 1)

You can use recursion for a more robust solution that handles input of unknown depth:

from itertools import groupby
data = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30}, 'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17}, 'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
def ave(d):
  _data = sorted([i for b in d for i in b.items()], key=lambda x:x[0])
  _d = [(a, [j for _, j in b]) for a, b in groupby(_data, key=lambda x:x[0])]
  return {a:ave(b) if isinstance(b[0], dict) else round(sum(b)/float(len(b)), 1) for a, b in _d}

result = {'average':ave(list(data.values()))}

Output:

{'average': {'age': 24.0, 'salary': {'year1': 43.3, 'year2': 58.3}}}
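
To illustrate the "unknown depth" point, here is a minimal sketch that runs the same `ave` function on input with one extra level of nesting. The `deep` data, including the `base` and `bonus` keys, is made up for illustration and is not from the question.

```python
from itertools import groupby

def ave(d):
    # Collect every (key, value) pair from all dicts, sorted by key so groupby can group them.
    _data = sorted([i for b in d for i in b.items()], key=lambda x: x[0])
    _d = [(a, [j for _, j in b]) for a, b in groupby(_data, key=lambda x: x[0])]
    # Recurse when the grouped values are dicts; otherwise average the numbers.
    return {a: ave(b) if isinstance(b[0], dict) else round(sum(b) / float(len(b)), 1) for a, b in _d}

# Hypothetical input: salary is split into base and bonus per year.
deep = {
    'actor1': {'salary': {'year1': {'base': 50, 'bonus': 10}}, 'age': 30},
    'actor2': {'salary': {'year1': {'base': 15, 'bonus': 5}}, 'age': 20},
}

deep_result = {'average': ave(list(deep.values()))}
print(deep_result)
# {'average': {'age': 25.0, 'salary': {'year1': {'base': 32.5, 'bonus': 7.5}}}}
```

Sorting uses only the key as the sort key, so the nested dict values are never compared with each other.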

Answer 1 (score: 1)

Here is another recursive solution:

def average_dicts(dicts):
    result = {}
    for i, d in enumerate(dicts):
        for k, v in d.items():
            update_dict_average(result, k, v, i)
    return result

def update_dict_average(current, key, update, n):
    if isinstance(update, dict):
        subcurrent = current.setdefault(key, {})
        for subkey, subupdate in update.items():
            update_dict_average(subcurrent, subkey, subupdate, n)
    else:
        current[key] = (current.get(key, 0) * n + update) / (n + 1)

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

result = {'average': average_dicts(d.values())}
print(result)
# {'average': {'salary': {'year1': 43.333333333333336, 'year2': 58.333333333333336}, 'age': 24.0}}
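
The leaf update in `update_dict_average` is an incremental (running) mean: if the mean of the first n values is m, then after one more value x the mean becomes (m * n + x) / (n + 1). A small sketch of just that step, using a hypothetical helper name `running_mean` and the year1 salaries from the question:

```python
def running_mean(values):
    # Fold each value into the mean of the values seen so far.
    mean = 0
    for n, x in enumerate(values):
        mean = (mean * n + x) / (n + 1)
    return mean

year1 = [60, 20, 50]
incremental = running_mean(year1)
direct = sum(year1) / len(year1)
print(incremental, direct)
```

Note that this relies on the question's guarantee that every dict contains every key. If a key were missing from some of the dicts, n would no longer match the number of values seen for that key and the result would be off.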

Answer 2 (score: 1)

This is how I would do it.

def avg(nums):
    nums = list(nums)
    return round(sum(nums) / len(nums), 1)

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

average = {'salary': {}}
average['age'] = avg(actor['age'] for actor in d.values())
for year in list(d.values())[0]['salary']:
    average['salary'][year] = avg(actor['salary'][year] for actor in d.values())

b = {'average': average}
print(b)
# {'average': {'salary': {'year1': 43.3, 'year2': 58.3}, 'age': 24.0}}

This handles any positive number of years and actors, and needs neither itertools nor numpy.

Answer 3 (score: 1)

A functional approach:

import itertools
import operator
from statistics import mean

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

#helpers
age = operator.itemgetter('age')
salary = operator.itemgetter('salary')
year = operator.itemgetter(0)
value = operator.itemgetter(1)

ages = map(age,d.values())
avg_age = mean(ages)
print(f'avg_age: {avg_age}')

salaries = map(dict.items, map(salary, d.values()))
salaries = sorted(itertools.chain.from_iterable(salaries), key=year)
for key, group in itertools.groupby(salaries, year):
    avg = mean(map(value, group))
    print(f'avg for {key}: {avg}')
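
The code above only prints the averages. As a rough sketch, the same grouping can be collected into the `b` dictionary the question asks for. Rounding to one decimal place is my assumption, chosen to match the numbers shown in the question.

```python
import itertools
import operator
from statistics import mean

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

year = operator.itemgetter(0)
value = operator.itemgetter(1)

# All (year, amount) pairs from every actor, sorted by year so groupby sees each year once.
pairs = sorted(
    itertools.chain.from_iterable(actor['salary'].items() for actor in d.values()),
    key=year,
)
avg_salary = {k: round(mean(map(value, group)), 1)
              for k, group in itertools.groupby(pairs, key=year)}

b = {'average': {'salary': avg_salary,
                 'age': mean(actor['age'] for actor in d.values())}}
print(b)
# {'average': {'salary': {'year1': 43.3, 'year2': 58.3}, 'age': 24}}
```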

Answer 4 (score: 0)

Here is my solution, which reuses the approach you used for age:

import numpy as np

b = {}
b['average'] = {}
b['average']["salary"] = {"year1": np.mean([d.get(i).get('salary').get('year1') for i in d]),
                          "year2": np.mean([d.get(i).get('salary').get('year2') for i in d])}
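
The year names are hard-coded here. A possible variation, sketched below, reads the years from the first actor instead, which is safe only because the question guarantees that every actor has the same salary keys. To keep the sketch free of third-party dependencies I have swapped `np.mean` for `statistics.mean`; `np.mean` would work the same way.

```python
from statistics import mean

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}

# Take the year names from the first actor (assumes all actors share them).
years = list(next(iter(d.values()))['salary'])

b = {'average': {
    'salary': {y: mean([d[i]['salary'][y] for i in d]) for y in years},
    'age': mean([d[i]['age'] for i in d]),
}}
print(b)
```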