Summing up a dictionary

Time: 2017-11-19 11:02:01

Tags: python dictionary

I have a dictionary stat of size 3 x 5 (three variables a, b and c, each with five data points). print(stat) gives the following output:

defaultdict(<class 'dict'>,{
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30}, 
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24}, 
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54}, 
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81}, 
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
})

I also have a list art that stores the variable names: art = ['a', 'b', 'c']

At the moment I compute the sums like this:

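import numpy as np

# One row per variable in art, one column per date in stat.
data = np.zeros((len(art), len(stat)))
for j, key in enumerate(stat):
    for k in range(len(art)):
        data[k][j] = stat[key][art[k]]

# Sum across the dates (columns) to get one total per variable.
sum_per_var = np.sum(data, axis=1)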

which gives:

sum_per_var = [ 106.  175.  249.]

But this approach looks really clumsy. Is there a more concise way to compute the sums for the variables a, b and c?

1 answer:

Answer 0: (score: 2)

In pure Python, you can use a list comprehension that calls sum on a generator expression:

[sum(d[key] for d in stat.values()) for key in art]

For example:

import datetime
stat = {
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30}, 
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24}, 
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54}, 
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81}, 
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
art = ['a', 'b', 'c']
[sum(d[key] for d in stat.values()) for key in art]
# [106, 175, 249]
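
Because stat in the question is a defaultdict(dict), some dates could in principle lack one of the variables, in which case d[key] would raise a KeyError. The data shown has every variable on every date, so the gap below is purely hypothetical. A minimal sketch, assuming a missing entry should count as 0:

```python
import datetime
from collections import defaultdict

# Sample data with one hypothetical gap: the entry for 4 November has no 'c'.
stat = defaultdict(dict)
stat[datetime.datetime(2017, 11, 3)] = {'a': 18, 'b': 82, 'c': 30}
stat[datetime.datetime(2017, 11, 4)] = {'a': 14, 'b': 10}
art = ['a', 'b', 'c']

# dict.get returns 0 for a missing variable instead of raising KeyError.
sums = [sum(d.get(key, 0) for d in stat.values()) for key in art]
print(sums)  # [32, 92, 30]
```

Whether 0 is the right default depends on the data; if a missing value should be an error, the plain d[key] version is the safer choice.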

But pandas is probably easier and more concise:

import datetime
import pandas as pd
stat = {
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30}, 
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24}, 
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54}, 
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81}, 
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
pd.DataFrame(stat).T
#              a   b   c
# 2017-11-03  18  82  30
# 2017-11-04  14  10  24
# 2017-11-05  14  61  54
# 2017-11-06  32  10  81
# 2017-11-07  28  12  60
pd.DataFrame(stat).T.sum()
# a    106
# b    175
# c    249
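
To get back a NumPy array ordered like art, as sum_per_var was in the question, the pandas result can be indexed by art and converted. This is a sketch rather than part of the original answer; it uses pd.DataFrame.from_dict with orient="index" as one way to avoid the transpose:

```python
import datetime
import pandas as pd

stat = {
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60},
}
art = ['a', 'b', 'c']

# orient="index" turns each date into a row, so no .T is needed.
df = pd.DataFrame.from_dict(stat, orient="index")

# Select the columns in the order given by art, sum each one,
# and convert the resulting Series to a NumPy array.
sum_per_var = df[art].sum().to_numpy()
print(sum_per_var)  # [106 175 249]
```

The values come out as integers here, whereas the loop in the question produced floats.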