I have a DataFrame set up like this:
date_string | type | amount
2015-01-01 | a | 500
2015-01-01 | b | 300
2015-01-01 | c | 200
2015-01-02 | a | 400
2015-01-02 | b | 600
2015-01-02 | c | 100
I want to add a new column giving each row's share of the total amount for its date, so that the result looks like this:
date_string | type | amount | percent
2015-01-01 | a | 500 | 0.5
2015-01-01 | b | 300 | 0.3
2015-01-01 | c | 200 | 0.2
2015-01-02 | a | 300 | 0.3
2015-01-02 | b | 600 | 0.6
2015-01-02 | c | 100 | 0.1
Answer 0 (score: 4)
Use GroupBy.transform with sum to get the per-date total aligned to every row, then divide the original column by it with Series.div:
df['percent'] = df['amount'].div(df.groupby('date_string')['amount'].transform('sum'))
print(df)
date_string type amount percent
0 2015-01-01 a 500 0.500000
1 2015-01-01 b 300 0.300000
2 2015-01-01 c 200 0.200000
3 2015-01-02 a 400 0.363636
4 2015-01-02 b 600 0.545455
5 2015-01-02 c 100 0.090909
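For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the DataFrame is built directly from the sample rows in the question; the intermediate name daily_total is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date_string": ["2015-01-01"] * 3 + ["2015-01-02"] * 3,
    "type": ["a", "b", "c"] * 2,
    "amount": [500, 300, 200, 400, 600, 100],
})

# transform("sum") returns one value per original row (the total for
# that row's date), so it lines up with df["amount"] index by index.
daily_total = df.groupby("date_string")["amount"].transform("sum")

# Each row's share of its own date's total.
df["percent"] = df["amount"].div(daily_total)

print(df)
```

Because transform keeps the original index, no merge or reindexing step is needed, and the percentages within each date add up to 1.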
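Answer 1 (score: 0)
This also works, provided you group by date_string only and normalize the amount column within each group:
import numpy as np
# Divide each amount by the sum of its own date group.
df['percent'] = df.groupby('date_string')['amount'].transform(lambda x: x / np.sum(x))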
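As a quick sanity check, the sketch below compares a lambda-based normalization against the transform('sum') approach on the question's sample data. It assumes grouping on date_string alone; the names via_lambda and via_sum are hypothetical and used only for this comparison.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date_string": ["2015-01-01"] * 3 + ["2015-01-02"] * 3,
    "type": ["a", "b", "c"] * 2,
    "amount": [500, 300, 200, 400, 600, 100],
})

grouped = df.groupby("date_string")["amount"]

# Lambda version: normalize each group by its own sum.
via_lambda = grouped.transform(lambda x: x / np.sum(x))

# Built-in version: broadcast the group sum, then divide.
via_sum = df["amount"] / grouped.transform("sum")

# Compare the two results element by element.
print(np.allclose(via_lambda, via_sum))
```

The two give the same numbers; the string form 'sum' is usually the faster choice on large frames because it avoids calling a Python function once per group.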