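I'm trying to create a column named loan_status_is_great on my pandas DataFrame. It should contain the integer 1 if loan_status is "Current" or "Fully Paid", and the integer 0 otherwise.
I'm using https://resources.lendingclub.com/LoanStats_2018Q4.csv.zip as the dataset.
My problematic code is: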
def loan_great():
    if (df['loan_status']).any == 'Current' or (df['loan_status']).any == 'Fully Paid':
        return 1
    else:
        return 0

df['loan_status_is_great']=df['loan_status'].apply(loan_great())
TypeError                                 Traceback (most recent call last)
in ()
----> 1 df['loan_status_is_great']=df['loan_status'].apply(loan_great())

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   4043             else:
   4044                 values = self.astype(object).values
-> 4045                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4046
   4047         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

TypeError: 'int' object is not callable
Answer 0 (score: 0)
Let's try another approach: use isin to create a boolean Series, then convert it to integers:
df['loan_status'].isin(['Current','Fully Paid']).astype(int)
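A minimal sketch of the isin approach, assuming a small made-up DataFrame in place of the LendingClub file (the sample statuses below are illustrative, not taken from the dataset), and assigning the result to the new column:

```python
import pandas as pd

# Hypothetical sample data standing in for the real loan file.
df = pd.DataFrame({
    'loan_status': ['Current', 'Fully Paid', 'Late (31-120 days)', 'Charged Off']
})

# isin returns a boolean Series; astype(int) maps True/False to 1/0.
df['loan_status_is_great'] = (
    df['loan_status'].isin(['Current', 'Fully Paid']).astype(int)
)

print(df)
```

Because isin works on the whole column at once, no per-row function or apply call is needed.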
Answer 1 (score: 0)
I find numpy's where function a good choice for creating simple columns like this, and it stays fast. Something like the following should work:
import numpy as np
# Each comparison needs its own parentheses because | binds tighter than ==.
df['loan_status_is_great'] = np.where((df['loan_status'] == 'Current') |
                                      (df['loan_status'] == 'Fully Paid'),
                                      1,
                                      0)
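A minimal sketch on made-up data, for illustration only. It builds the column with np.where and, for comparison, with a hypothetical row-wise helper (loan_great_row, which is not from the question) handed to apply without parentheses. The error in the question arises because apply(loan_great()) calls the function first and passes its integer result to apply, which then tries to call that integer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real loan file.
df = pd.DataFrame({'loan_status': ['Current', 'Fully Paid', 'Charged Off']})

# Parenthesize each comparison: | has higher precedence than ==.
is_great = (df['loan_status'] == 'Current') | (df['loan_status'] == 'Fully Paid')
df['loan_status_is_great'] = np.where(is_great, 1, 0)

# Hypothetical per-row helper: receives one status string at a time.
def loan_great_row(status):
    return 1 if status in ('Current', 'Fully Paid') else 0

# Pass the function object itself (no parentheses) so apply can call it per row.
via_apply = df['loan_status'].apply(loan_great_row)

print(df['loan_status_is_great'].tolist(), via_apply.tolist())
```

Both routes should give the same column; the vectorized np.where form avoids a Python-level call per row.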