I have a dictionary stat of size 3 x 5 (three variables a, b and c, each with five data points), and print(stat) gives the following output:
defaultdict(<class 'dict'>,{
datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
})
I also have a list art that stores the variable names: art = ['a', 'b', 'c']
Currently I compute the sums like this:
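import numpy as np

# one row per variable, one column per date
data = np.zeros((len(art), len(stat)))
j = 0
for key in stat:
    # loop over the three variables listed in art
    for k in range(len(art)):
        data[k][j] = stat[key][art[k]]
    j = j + 1
sum_per_var = np.sum(data, axis=1)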
This gives:
sum_per_var = [ 106. 175. 249.]
But this approach looks really clumsy. Is there a more concise way to compute the sums of the variables a, b and c?
Answer 0 (score: 2)
In pure Python, you can use sum with a generator expression inside a list comprehension:
[sum(d[key] for d in stat.values()) for key in art]
For example:
import datetime
stat = {
datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
art = ['a', 'b', 'c']
[sum(d[key] for d in stat.values()) for key in art]
# [106, 175, 249]
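If you would rather look totals up by variable name than by position, a single pass with collections.Counter is one option. This is only a minimal sketch and not part of the code above; the name totals is made up for illustration.

```python
import datetime
from collections import Counter

stat = {
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
art = ['a', 'b', 'c']

# totals is a hypothetical name; Counter.update adds the values of a
# mapping onto the running counts instead of overwriting them
totals = Counter()
for d in stat.values():
    totals.update(d)

# read the totals back in the order given by art
sum_per_var = [totals[key] for key in art]
print(sum_per_var)
# [106, 175, 249]
```

This walks the outer dictionary once instead of once per variable, which only matters when stat is large.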
But pandas is probably easier and more concise:
import datetime
import pandas as pd
stat = {
datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
pd.DataFrame(stat).T
#              a   b   c
# 2017-11-03  18  82  30
# 2017-11-04  14  10  24
# 2017-11-05  14  61  54
# 2017-11-06  32  10  81
# 2017-11-07  28  12  60
pd.DataFrame(stat).T.sum()
# a    106
# b    175
# c    249
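As a variation on the pandas approach, DataFrame.from_dict with orient="index" builds the frame with dates as rows directly, so no transpose is needed, and selecting the columns with art keeps the sums in the same order as the list. This is a sketch under the assumption that a plain Python list is wanted at the end; the names df and sums are made up for illustration.

```python
import datetime
import pandas as pd

stat = {
    datetime.datetime(2017, 11, 3, 0, 0): {'a': 18, 'b': 82, 'c': 30},
    datetime.datetime(2017, 11, 4, 0, 0): {'a': 14, 'b': 10, 'c': 24},
    datetime.datetime(2017, 11, 5, 0, 0): {'a': 14, 'b': 61, 'c': 54},
    datetime.datetime(2017, 11, 6, 0, 0): {'a': 32, 'b': 10, 'c': 81},
    datetime.datetime(2017, 11, 7, 0, 0): {'a': 28, 'b': 12, 'c': 60}
}
art = ['a', 'b', 'c']

# df and sums are hypothetical names; orient="index" turns the outer
# keys (the dates) into rows and the inner keys into columns
df = pd.DataFrame.from_dict(stat, orient="index")

# pick the columns in the order given by art, sum each column,
# then convert the resulting Series to a plain list
sums = df[art].sum().tolist()
print(sums)
# [106, 175, 249]
```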