Get the percentage of the total in a pandas DataFrame

Time: 2018-09-20 14:28:36

Tags: python pandas

I have a DataFrame laid out as follows:

date_string | type | amount
 2015-01-01 |  a   | 500
 2015-01-01 |  b   | 300
 2015-01-01 |  c   | 200
 2015-01-02 |  a   | 400
 2015-01-02 |  b   | 600
 2015-01-02 |  c   | 100

I would like to add a new column giving each row's share of the total amount for its date, so that the result looks like this:

date_string | type | amount | percent
 2015-01-01 |  a   | 500    | 0.5
 2015-01-01 |  b   | 300    | 0.3
 2015-01-01 |  c   | 200    | 0.2
 2015-01-02 |  a   | 400    | 0.364
 2015-01-02 |  b   | 600    | 0.545
 2015-01-02 |  c   | 100    | 0.091

2 answers:

Answer 0 (score: 4)

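Use GroupBy.transform with 'sum' to get the total for each date, aligned to the original rows, then divide the original column by it with Series.div: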

# Divide each amount by the total amount for its date
df['percent'] = df['amount'].div(df.groupby('date_string')['amount'].transform('sum'))
print(df)
  date_string type  amount   percent
0  2015-01-01    a     500  0.500000
1  2015-01-01    b     300  0.300000
2  2015-01-01    c     200  0.200000
3  2015-01-02    a     400  0.363636
4  2015-01-02    b     600  0.545455
5  2015-01-02    c     100  0.090909
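
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's sample data; the names `daily_total` and `check` are only illustrative and do not come from the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date_string": ["2015-01-01"] * 3 + ["2015-01-02"] * 3,
    "type": ["a", "b", "c"] * 2,
    "amount": [500, 300, 200, 400, 600, 100],
})

# transform('sum') returns one value per original row:
# the total amount of the date that row belongs to
daily_total = df.groupby("date_string")["amount"].transform("sum")

# Element-wise division gives each row's share of its date's total
df["percent"] = df["amount"].div(daily_total)

# Sanity check (illustrative): shares within each date should add up to 1
check = df.groupby("date_string")["percent"].sum()

print(df)
print(check)
```

Because `transform` keeps the original index, the result can be assigned straight back as a column with no merge step.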

Answer 1 (score: 0)

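This should work, provided you group by date_string only and divide each amount by its group's total:

import numpy as np
# Group by date only, then divide each amount by the sum of its group
df['percent'] = df.groupby('date_string')['amount'].transform(lambda x: x / np.sum(x))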
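
A note on the grouping key, as a rough sketch using the question's sample data: the share is only meaningful when the groups are the dates. If `amount` is added to the key, every row in this data ends up in its own group and is divided by itself. The variable names below are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date_string": ["2015-01-01"] * 3 + ["2015-01-02"] * 3,
    "type": ["a", "b", "c"] * 2,
    "amount": [500, 300, 200, 400, 600, 100],
})

# Grouping by date only: each amount is divided by its date's total
by_date = df.groupby("date_string")["amount"].transform(
    lambda x: x / np.sum(x)
)

# Grouping by date AND amount: each row is its own group in this data,
# so every value is divided by itself
by_date_and_amount = df.groupby(["date_string", "amount"])["amount"].transform(
    lambda x: x / np.sum(x)
)

print(by_date.tolist())
print(by_date_and_amount.tolist())
```

The first result matches the output of the `transform('sum')` approach; the second is 1.0 on every row, which is not the per-date percentage the question asks for.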