I have the following list of Counters:
[Counter({'A': 2, 'B': 2, 'C': 1}),
Counter({'A': 3, 'B': 3, 'C': 2}),
Counter({'A': 4, 'B': 4, 'C': 4}),
Counter({'A': 5, 'B': 4, 'C': 5}),
Counter({'A': 6, 'B': 6, 'C': 6}),
Counter({'A': 7, 'B': 8, 'C': 8}),
Counter({'A': 8, 'B': 9, 'C': 9}),
Counter({'A': 9, 'B': 12, 'C': 10}),
Counter({'A': 11, 'B': 14, 'C': 13}),
Counter({'A': 13, 'B': 17, 'C': 17}),
Counter({'A': 15, 'B': 19, 'C': 20}),
Counter({'A': 17, 'B': 22, 'C': 22}),
Counter({'A': 19, 'B': 24, 'C': 24}),
Counter({'A': 22, 'B': 26, 'C': 27}),
Counter({'A': 24, 'B': 29, 'C': 29}),
Counter({'A': 26, 'B': 30, 'C': 30}),
Counter({'A': 30, 'B': 33, 'C': 35}),
Counter({'A': 34, 'B': 35, 'C': 38}),
Counter({'A': 37, 'B': 40, 'C': 42}),
Counter({'A': 40, 'B': 42, 'C': 46})]
For each Counter, I want to compute the probabilities. I did it like this for a single one:
counts = ({'A': 2, 'B': 2, 'C': 1})
counts
total = sum(counts.values())
print(total)
probability_mass = {k:v/total for k,v in counts.items()}
print(probability_mass)
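I had to pull out each Counter by hand, which is not Pythonic at all. How can I perform this operation for every Counter, i.e. first find the probabilities for
Counter({'A': 2, 'B': 2, 'C': 1})
then for
Counter({'A': 3, 'B': 3, 'C': 2})
and so on, and then build a DataFrame from the results?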
Answer 0 (score: 1)
Since pd.DataFrame() knows how to handle a list of dictionaries, this can be done quite easily:
from collections import Counter
import pandas as pd

counter_list = [Counter({'A': 2, 'B': 2, 'C': 1}),
Counter({'A': 3, 'B': 3, 'C': 2}),
Counter({'A': 4, 'B': 4, 'C': 4}),
Counter({'A': 5, 'B': 4, 'C': 5}),
Counter({'A': 6, 'B': 6, 'C': 6}),
Counter({'A': 7, 'B': 8, 'C': 8}),
Counter({'A': 8, 'B': 9, 'C': 9}),
Counter({'A': 9, 'B': 12, 'C': 10}),
Counter({'A': 11, 'B': 14, 'C': 13}),
Counter({'A': 13, 'B': 17, 'C': 17}),
Counter({'A': 15, 'B': 19, 'C': 20}),
Counter({'A': 17, 'B': 22, 'C': 22}),
Counter({'A': 19, 'B': 24, 'C': 24}),
Counter({'A': 22, 'B': 26, 'C': 27}),
Counter({'A': 24, 'B': 29, 'C': 29}),
Counter({'A': 26, 'B': 30, 'C': 30}),
Counter({'A': 30, 'B': 33, 'C': 35}),
Counter({'A': 34, 'B': 35, 'C': 38}),
Counter({'A': 37, 'B': 40, 'C': 42}),
Counter({'A': 40, 'B': 42, 'C': 46})]
df = pd.DataFrame([{k: v / sum(counter.values()) for k, v in counter.items()}
for counter in counter_list])
print(df)
# A B C
# 0 0.400000 0.400000 0.200000
# 1 0.375000 0.375000 0.250000
# 2 0.333333 0.333333 0.333333
# 3 0.357143 0.285714 0.357143
# 4 0.333333 0.333333 0.333333
# 5 0.304348 0.347826 0.347826
# 6 0.307692 0.346154 0.346154
# 7 0.290323 0.387097 0.322581
# 8 0.289474 0.368421 0.342105
# 9 0.276596 0.361702 0.361702
# 10 0.277778 0.351852 0.370370
# 11 0.278689 0.360656 0.360656
# 12 0.283582 0.358209 0.358209
# 13 0.293333 0.346667 0.360000
# 14 0.292683 0.353659 0.353659
# 15 0.302326 0.348837 0.348837
# 16 0.306122 0.336735 0.357143
# 17 0.317757 0.327103 0.355140
# 18 0.310924 0.336134 0.352941
# 19 0.312500 0.328125 0.359375
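The comprehension above recomputes sum(counter.values()) once per key. A minimal sketch of the same idea that computes each total only once is shown below. The helper name to_probabilities and the three-element sample list are illustrative and not part of the original answer, and returning an empty dict for an empty Counter is an assumption about the desired behavior.

```python
from collections import Counter

import pandas as pd


def to_probabilities(counter):
    """Return a dict mapping each key to its share of the counter's total."""
    total = sum(counter.values())  # computed once per counter
    if total == 0:
        # Assumption: an empty counter yields an empty row rather than an error.
        return {}
    return {key: count / total for key, count in counter.items()}


# Shortened sample taken from the first three Counters in the question.
sample = [Counter({'A': 2, 'B': 2, 'C': 1}),
          Counter({'A': 3, 'B': 3, 'C': 2}),
          Counter({'A': 4, 'B': 4, 'C': 4})]

prob_df = pd.DataFrame([to_probabilities(c) for c in sample])
print(prob_df)
```

Passing the full counter_list instead of sample should give the same table as above, with every row summing to 1.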
Answer 1 (score: 1)
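Personally, I think it is better to pass your list of Counters directly to the DataFrame constructor and then divide each row by its sum: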
df = pd.DataFrame(counter_list)
s = df.div(df.sum(axis=1), axis=0)  # divide every row by its row total
s
Out[1718]:
A B C
0 0.400000 0.400000 0.200000
1 0.375000 0.375000 0.250000
2 0.333333 0.333333 0.333333
3 0.357143 0.285714 0.357143
4 0.333333 0.333333 0.333333
.. ... ... ...
15 0.302326 0.348837 0.348837
16 0.306122 0.336735 0.357143
17 0.317757 0.327103 0.355140
18 0.310924 0.336134 0.352941
19 0.312500 0.328125 0.359375
[20 rows x 3 columns]
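One caveat that may be worth noting: if some Counters lack a key, pd.DataFrame fills the gap with NaN, and that NaN carries through the division. A small sketch of one way to handle this is to treat a missing key as a count of zero with fillna(0) before dividing. The uneven list below is made-up illustration data, not taken from the question.

```python
from collections import Counter

import pandas as pd

# Hypothetical data: the first Counter has no 'C' key.
uneven = [Counter({'A': 2, 'B': 2}),
          Counter({'A': 1, 'B': 1, 'C': 2})]

# Missing keys become NaN; interpret them as a count of zero.
counts_df = pd.DataFrame(uneven).fillna(0)

# Divide each row by its row total to get probabilities.
probs = counts_df.div(counts_df.sum(axis=1), axis=0)
print(probs)
```

Whether zero is the right fill value depends on what a missing key means in your data; if it means "unknown" rather than "never seen", leaving the NaN in place may be more honest.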