Trying to calculate a percentage and add a new column with pandas

Time: 2017-02-12 21:33:46

Tags: python pandas

I have a pandas DataFrame that I created with groupby, and the result looks like this:

          loan_type
type            
risky      23150
safe       99457

I would like to create a column named pct and add it to the DataFrame. I tried this:

total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)

The result was this:

       loan_type  pct
type                 
risky      23150  NaN
safe       99457  NaN

At this point I'm not sure what I need to do to make this work. Here is the code I used to create the whole thing:

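import numpy as np
import pandas as pd

# club is the asker's source DataFrame (not shown in the post)
bad_loans = np.array(club['bad_loans'])

# Recode in place: 0 becomes 1, every other value becomes -1
for index, row in enumerate(bad_loans):
    if row == 0:
        bad_loans[index] = 1
    else:
        bad_loans[index] = -1

loans = pd.DataFrame({'loan_type' : bad_loans})
loans['type'] = np.where(loans['loan_type'] == 1, 'safe', 'risky')
# Sum per type, then take absolute values so the risky total is positive
loans = np.absolute(loans.groupby(['type']).agg({'loan_type': 'sum'}))
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)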

1 answer:

Answer 0 (score: 1)

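The problem is that you are dividing not by a scalar but by a one-element Series. pandas aligns the two operands on their index labels, and because the index of total (the column name loan_type) does not match the index of the column (risky, safe), you get NaNs.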
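The misalignment can be reproduced in isolation. The sketch below rebuilds the grouped frame by hand from the counts shown in the question (an assumption made for illustration, since the original club data is not part of the post) and shows that the two Series share no index labels:

```python
import pandas as pd

# Hand-built stand-in for the grouped frame; counts copied from the question.
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

total = loans.sum(numeric_only=True)  # Series indexed by column name: ['loan_type']
column = loans["loan_type"]           # Series indexed by row label: ['risky', 'safe']

# Series / Series aligns on index labels; no label is shared,
# so every position in the result is NaN.
misaligned = column / total
print(misaligned)
```

The result has one entry per label from either index, and none of them has a value on both sides of the division.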

I think the simplest fix is to convert the Series total to a NumPy array, which drops the index so nothing has to align:

total = loans.sum(numeric_only=True)
loans['pct'] = loans.loan_type / total.values

print (loans)
       loan_type       pct
type                      
risky      23150  0.188815
safe       99457  0.811185

Or select the single value by position with .iloc[0], so that total is a plain number:

total = loans.sum(numeric_only=True).iloc[0]
loans['pct'] = loans.loan_type / total

print (loans)
       loan_type       pct
type                      
risky      23150  0.188815
safe       99457  0.811185
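As a variation on the same idea, and only a sketch under the assumption that loan_type is the one column you want shares of, summing that column directly already gives a scalar, so no index alignment is involved. The frame is again built by hand from the counts in the question:

```python
import pandas as pd

# Hand-built stand-in for the grouped frame; counts copied from the question.
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Series.sum() returns a scalar, so the division broadcasts over every row.
loans["pct"] = loans["loan_type"] / loans["loan_type"].sum()
print(loans)
```

This avoids the intermediate total Series entirely, at the cost of naming the column explicitly.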