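I want to add a column based on the column 'mths_since_recent_revol_delinq': if mths_since_recent_revol_delinq is null, the new column should be 1, otherwise 0. The resulting DataFrame should look like this: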
+----+--------------------------------+------------------------------------+
| | mths_since_recent_revol_delinq | mths_since_recent_revol_delinq_add |
+----+--------------------------------+------------------------------------+
| 0 | NaN | 1 |
| 1 | 33 | 0 |
| 2 | NaN | 1 |
| 3 | NaN | 1 |
| 4 | 57 | 0 |
| 5 | 21 | 0 |
| 6 | 60 | 0 |
| 7 | NaN | 1 |
| 8 | 2 | 0 |
| 9 | 24 | 0 |
| 10 | NaN | 1 |
+----+--------------------------------+------------------------------------+
def label_race (df):
    if df['mths_since_recent_revol_delinq'].isnull():
        return 1
    else:
        return 0
Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)
And the traceback:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4150             if reduce is None:
   4151                 reduce = True
-> 4152             return self._apply_standard(f, axis, reduce=reduce)
   4153         else:
   4154             return self._apply_broadcast(f, axis)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4246         try:
   4247             for i, v in enumerate(series_gen):
-> 4248                 results[i] = func(v)
   4249                 keys.append(v.name)
   4250         except Exception as e:

in (df)
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

in label_race(df)
      1 def label_race (df):
----> 2     if df['mths_since_recent_revol_delinq'].isnull():
      3         return 1
      4     else:
      5         return 0

AttributeError: ("'float' object has no attribute 'isnull'", 'occurred at index 0')
Any ideas on how to fix this? Thanks.
Answer 0 (score: 1)
Use isnull on the whole column, then cast the boolean result to int with astype:
import numpy as np
import pandas as pd

Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan, np.nan, 57.0, 21.0, 60.0, np.nan, 2.0, 24.0, np.nan]})
results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"
# isnull() gives a boolean Series; astype(int) maps True/False to 1/0
Loan_a1[results_key] = Loan_a1[input_key].isnull().astype(int)
print(Loan_a1)
    mths_since_recent_revol_delinq  mths_since_recent_revol_delinq_add
0                              NaN                                   1
1                             33.0                                   0
2                              NaN                                   1
3                              NaN                                   1
4                             57.0                                   0
5                             21.0                                   0
6                             60.0                                   0
7                              NaN                                   1
8                              2.0                                   0
9                             24.0                                   0
10                             NaN                                   1
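The original attempt fails because `apply(..., axis=1)` passes each row as a Series, so looking up one column in that row yields a single scalar float, and a float has no `.isnull()` method. The sketch below illustrates the difference between the cell-level and column-level views. The three-row frame `demo` is a hypothetical stand-in, not the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, used only for illustration.
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan]})
col = "mths_since_recent_revol_delinq"

# Cell-level access: a single value is a scalar float, not a Series.
first_cell = demo.loc[0, col]
cell_has_isnull = hasattr(first_cell, "isnull")  # a float lacks this method

# Column-level access: the whole column is a Series, which does have isnull().
flags = demo[col].isnull().astype(int)

print(type(first_cell).__name__, cell_has_isnull)
print(flags.tolist())
```

Working on the column as a whole also avoids a Python-level function call per row, which is why it is usually the preferred form.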
Answer 1 (score: 0)
I think this solves your problem with the fewest changes to your code:
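import pandas as pd

def label_race(df):
    # With axis=1 each row arrives as a Series, so this lookup is a scalar;
    # pd.isna works on scalars, unlike the .isnull() method
    if pd.isna(df['mths_since_recent_revol_delinq']):
        return 1
    else:
        return 0

Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply(lambda df: label_race(df), axis=1)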
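If the function only needs the one column, the same scalar check can be applied to that column alone rather than to every full row. This is a sketch under assumptions: `demo` is a hypothetical toy frame and `flag_missing` is an illustrative name, neither taken from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, used only for illustration.
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, 57.0, np.nan]})
col = "mths_since_recent_revol_delinq"

def flag_missing(value):
    """Return 1 when a single cell is missing, else 0."""
    return int(pd.isna(value))

# Series.apply hands the function one scalar at a time.
elementwise = demo[col].apply(flag_missing)

# The vectorized form should produce identical flags.
vectorized = demo[col].isna().astype(int)

print(elementwise.tolist())
print(elementwise.tolist() == vectorized.tolist())
```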
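Answer 2 (score: -1)
Following @Jezrael's structure and definitions, and using the neat trick that True * 1 = 1 and False * 1 = 0, the same result can also be obtained with assign. The column name has to be passed by dict unpacking, because a bare results_key=... keyword would name the new column literally "results_key":
Loan_a1 = Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})
Note that assign returns a new DataFrame and leaves the original unchanged, which is why the result is bound back to Loan_a1 here.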
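One detail of `assign` is easy to miss: keyword arguments become column names verbatim, so a variable holding the desired name has to be unpacked from a dict. The sketch below contrasts the two forms and shows that the source frame is left untouched. The frame `demo` is a hypothetical example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, used only for illustration.
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan]})
input_key = "mths_since_recent_revol_delinq"
results_key = "mths_since_recent_revol_delinq_add"

# Keyword form: the new column is literally named "results_key".
literal = demo.assign(results_key=lambda x: x[input_key].isnull() * 1)

# Dict-unpacking form: the column name comes from the variable's value.
dynamic = demo.assign(**{results_key: lambda x: x[input_key].isnull() * 1})

print(list(literal.columns))
print(list(dynamic.columns))
print(list(demo.columns))  # demo itself is not modified by assign
```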