I want to calculate a percentage for each row. Here is a sample DataFrame:
KEY DESCR counts
0 2 to A 1
1 2 to B 1
2 20 to C 1
3 35 to D 2
4 110 to E 4
5 110 to F 1
6 110 to G 1
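The percentage formula is: (counts / sum of the counts column for the same KEY) * 100
Example: (1/2) * 100
Below is the code where I am stuck; I have tried many times without success.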
percentage = []
for i in range(len(df)):
percentage.append((df['counts'][i] / ...............) * 100)
df['PERCENTAGE'] = percentage
df
The expected output is:
KEY DESCR counts PERCENTAGE
0 2 to A 1 50
1 2 to B 1 50
2 20 to C 1 100
3 35 to A 2 100
4 110 to E 4 67
5 110 to C 1 16
6 110 to G 1 16
Can anyone help me solve this? Thanks.
Answer 0 (score: 0)
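If performance matters, use GroupBy.transform with sum to get the per-KEY total for each row, divide the original column by it with Series.div, and finally multiply by 100 with Series.mul: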
df['PERCENTAGE'] = df['counts'].div(df.groupby('KEY')['counts'].transform('sum')).mul(100)
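You can also divide each value within its group using a lambda, but this performs poorly on larger DataFrames or when there are many groups: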
df['PERCENTAGE'] = df.groupby('KEY')['counts'].transform(lambda x: x / x.sum()).mul(100)
print(df)
KEY DESCR counts PERCENTAGE
0 2 to A 1 50.000000
1 2 to B 1 50.000000
2 20 to C 1 100.000000
3 35 to D 2 100.000000
4 110 to E 4 66.666667
5 110 to F 1 16.666667
6 110 to G 1 16.666667
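For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question and applies the transform('sum') approach. The names group_totals and PERCENTAGE_ROUNDED are my own additions for illustration and are not part of the original question or answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "KEY": [2, 2, 20, 35, 110, 110, 110],
    "DESCR": ["to A", "to B", "to C", "to D", "to E", "to F", "to G"],
    "counts": [1, 1, 1, 2, 4, 1, 1],
})

# Per-row total of counts within the same KEY (same index as df).
# group_totals is an illustrative name, not from the original answer.
group_totals = df.groupby("KEY")["counts"].transform("sum")

# Share of each row within its KEY group, as a percentage.
df["PERCENTAGE"] = df["counts"].div(group_totals).mul(100)

# Optional integer column (hypothetical name) for display purposes.
df["PERCENTAGE_ROUNDED"] = df["PERCENTAGE"].round().astype(int)

print(df)
```

Note that rounding turns 16.666667 into 17, whereas the expected output in the question shows 16, so those values appear to have been truncated rather than rounded. If truncation is really what is wanted, df["PERCENTAGE"].astype(int) would be one way to get it, although the 67 in the same table would then become 66.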