Summing over a list of Counters

Time: 2018-04-09 13:49:00

Tags: python python-3.x list pandas counter

I have the following list of Counters:

[Counter({'A': 2, 'B': 2, 'C': 1}),
Counter({'A': 3, 'B': 3, 'C': 2}),
Counter({'A': 4, 'B': 4, 'C': 4}),
Counter({'A': 5, 'B': 4, 'C': 5}),
Counter({'A': 6, 'B': 6, 'C': 6}),
Counter({'A': 7, 'B': 8, 'C': 8}),
Counter({'A': 8, 'B': 9, 'C': 9}),
Counter({'A': 9, 'B': 12, 'C': 10}),
Counter({'A': 11, 'B': 14, 'C': 13}),
Counter({'A': 13, 'B': 17, 'C': 17}),
Counter({'A': 15, 'B': 19, 'C': 20}),
Counter({'A': 17, 'B': 22, 'C': 22}),
Counter({'A': 19, 'B': 24, 'C': 24}),
Counter({'A': 22, 'B': 26, 'C': 27}),
Counter({'A': 24, 'B': 29, 'C': 29}),
Counter({'A': 26, 'B': 30, 'C': 30}),
Counter({'A': 30, 'B': 33, 'C': 35}),
Counter({'A': 34, 'B': 35, 'C': 38}),
Counter({'A': 37, 'B': 40, 'C': 42}),
Counter({'A': 40, 'B': 42, 'C': 46})]

For each Counter, I want to compute the probability of each key. This is how I did it for one of them:

counts = ({'A': 2, 'B': 2, 'C': 1})
counts
total = sum(counts.values())
print(total)
probability_mass = {k:v/total for k,v in counts.items()}
print(probability_mass)

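I had to strip off the Counter wrapper by hand, which isn't Pythonic at all. How can I do this for every Counter, i.e. first find the probabilities for

Counter({'A': 2, 'B': 2, 'C': 1})

then for

Counter({'A': 3, 'B': 3, 'C': 2})

and so on, and then build a DataFrame from the results?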

2 answers:

Answer 0: (score: 1)

Since pd.DataFrame() knows how to handle a list of dictionaries, this is easy to do:

from collections import Counter
import pandas as pd

counter_list = [Counter({'A': 2, 'B': 2, 'C': 1}),
          Counter({'A': 3, 'B': 3, 'C': 2}),
          Counter({'A': 4, 'B': 4, 'C': 4}),
          Counter({'A': 5, 'B': 4, 'C': 5}),
          Counter({'A': 6, 'B': 6, 'C': 6}),
          Counter({'A': 7, 'B': 8, 'C': 8}),
          Counter({'A': 8, 'B': 9, 'C': 9}),
          Counter({'A': 9, 'B': 12, 'C': 10}),
          Counter({'A': 11, 'B': 14, 'C': 13}),
          Counter({'A': 13, 'B': 17, 'C': 17}),
          Counter({'A': 15, 'B': 19, 'C': 20}),
          Counter({'A': 17, 'B': 22, 'C': 22}),
          Counter({'A': 19, 'B': 24, 'C': 24}),
          Counter({'A': 22, 'B': 26, 'C': 27}),
          Counter({'A': 24, 'B': 29, 'C': 29}),
          Counter({'A': 26, 'B': 30, 'C': 30}),
          Counter({'A': 30, 'B': 33, 'C': 35}),
          Counter({'A': 34, 'B': 35, 'C': 38}),
          Counter({'A': 37, 'B': 40, 'C': 42}),
          Counter({'A': 40, 'B': 42, 'C': 46})]

df = pd.DataFrame([{k: v / sum(counter.values()) for k, v in counter.items()}
                   for counter in counter_list])
print(df)

#             A         B         C     
#  0   0.400000  0.400000  0.200000
#  1   0.375000  0.375000  0.250000
#  2   0.333333  0.333333  0.333333
#  3   0.357143  0.285714  0.357143
#  4   0.333333  0.333333  0.333333
#  5   0.304348  0.347826  0.347826
#  6   0.307692  0.346154  0.346154
#  7   0.290323  0.387097  0.322581
#  8   0.289474  0.368421  0.342105
#  9   0.276596  0.361702  0.361702
#  10  0.277778  0.351852  0.370370
#  11  0.278689  0.360656  0.360656
#  12  0.283582  0.358209  0.358209
#  13  0.293333  0.346667  0.360000
#  14  0.292683  0.353659  0.353659
#  15  0.302326  0.348837  0.348837
#  16  0.306122  0.336735  0.357143
#  17  0.317757  0.327103  0.355140
#  18  0.310924  0.336134  0.352941
#  19  0.312500  0.328125  0.359375
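A Counter is a dict subclass, so .values() and .items() work on it directly and nothing has to be stripped off by hand. As a minimal sketch of the same idea, the normalization can be pulled into a small helper that computes each total only once. The name normalize and the handling of an empty Counter are assumptions made for illustration, not part of the original answer, and only the first three Counters from the question are used to keep it short:

```python
from collections import Counter

import pandas as pd


def normalize(counter):
    """Return a dict mapping each key to its share of the counter's total."""
    total = sum(counter.values())  # computed once per Counter
    if total == 0:
        # Assumption: an empty or all-zero Counter yields no probabilities.
        return {}
    return {key: count / total for key, count in counter.items()}


# First three Counters from the question.
counter_list = [
    Counter({'A': 2, 'B': 2, 'C': 1}),
    Counter({'A': 3, 'B': 3, 'C': 2}),
    Counter({'A': 4, 'B': 4, 'C': 4}),
]

# One row per Counter, one column per key.
df = pd.DataFrame([normalize(counter) for counter in counter_list])
print(df)
```

Passing the full list from the question instead should give the same table as above.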

Answer 1: (score: 1)

In my view, it is better to pass your list of Counters straight to the DataFrame constructor and normalize there:

df = pd.DataFrame(counter_list)
# Divide each row by its row total to turn counts into probabilities
s = df.div(df.sum(axis=1), axis=0)
s
Out[1718]: 
           A         B         C
0   0.400000  0.400000  0.200000
1   0.375000  0.375000  0.250000
2   0.333333  0.333333  0.333333
3   0.357143  0.285714  0.357143
4   0.333333  0.333333  0.333333
..       ...       ...       ...
15  0.302326  0.348837  0.348837
16  0.306122  0.336735  0.357143
17  0.317757  0.327103  0.355140
18  0.310924  0.336134  0.352941
19  0.312500  0.328125  0.359375
[20 rows x 3 columns]
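Every Counter in the question has the same three keys. If some Counter lacked a key, the constructor would leave NaN in that cell. The following is a sketch of one way to handle that case. The two-element input is hypothetical, and treating a missing key as a count of zero is an assumption:

```python
from collections import Counter

import pandas as pd

# Hypothetical input: the second Counter has no 'C' entry.
counter_list = [
    Counter({'A': 2, 'B': 2, 'C': 1}),
    Counter({'A': 3, 'B': 1}),
]

# Missing keys become NaN in the constructor; treat them as zero counts.
counts = pd.DataFrame(counter_list).fillna(0)

# Divide each row by its own total (the row sums align on the index).
probs = counts.div(counts.sum(axis=1), axis=0)
print(probs)
```

Filling before dividing keeps the row totals unchanged, so each row still sums to 1.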