Pandas: add a column based on another column

Time: 2018-07-03 14:38:43

Tags: python pandas

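I want to add a new column based on the column 'mths_since_recent_revol_delinq': if mths_since_recent_revol_delinq is null, the new column should equal 1, otherwise 0. The resulting DataFrame should look like this: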

+----+--------------------------------+------------------------------------+
|    | mths_since_recent_revol_delinq | mths_since_recent_revol_delinq_add |
+----+--------------------------------+------------------------------------+
|  0 | NaN                            |                                  1 |
|  1 | 33                             |                                  0 |
|  2 | NaN                            |                                  1 |
|  3 | NaN                            |                                  1 |
|  4 | 57                             |                                  0 |
|  5 | 21                             |                                  0 |
|  6 | 60                             |                                  0 |
|  7 | NaN                            |                                  1 |
|  8 | 2                              |                                  0 |
|  9 | 24                             |                                  0 |
| 10 | NaN                            |                                  1 |
+----+--------------------------------+------------------------------------+

def label_race (df):
   if df['mths_since_recent_revol_delinq'].isnull():
      return 1
   else:
      return 0

Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

This is the code I tried above, and here is the traceback:

  

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply(lambda df: label_race(df), axis=1)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4150                     if reduce is None:
   4151                         reduce = True
-> 4152                     return self._apply_standard(f, axis, reduce=reduce)
   4153             else:
   4154                 return self._apply_broadcast(f, axis)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4246             try:
   4247                 for i, v in enumerate(series_gen):
-> 4248                     results[i] = func(v)
   4249                     keys.append(v.name)
   4250             except Exception as e:

in (df)
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply(lambda df: label_race(df), axis=1)

in label_race(df)
      1 def label_race(df):
----> 2    if df['mths_since_recent_revol_delinq'].isnull():
      3       return 1
      4    else:
      5       return 0

AttributeError: ("'float' object has no attribute 'isnull'", 'occurred at index 0')

Any ideas on how to fix it? Thanks.

3 answers:

Answer 0: (score: 1)

Use isnull, then cast the resulting booleans to int with astype:

import numpy as np
import pandas as pd

Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan, np.nan, 57.0, 21.0, 60.0, np.nan, 2.0, 24.0, np.nan]})

results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"
Loan_a1[results_key] = Loan_a1[input_key].isnull().astype(int)
print (Loan_a1)
    mths_since_recent_revol_delinq  mths_since_recent_revol_delinq_add
0                              NaN                                   1
1                             33.0                                   0
2                              NaN                                   1
3                              NaN                                   1
4                             57.0                                   0
5                             21.0                                   0
6                             60.0                                   0
7                              NaN                                   1
8                              2.0                                   0
9                             24.0                                   0
10                             NaN                                   1
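
If the same flag is needed for several columns, the one-liner above could be wrapped in a small helper. This is only a sketch: add_null_flag and its suffix parameter are hypothetical names chosen for illustration, not part of pandas, and the demo frame is a shortened stand-in for Loan_a1.

```python
import numpy as np
import pandas as pd


def add_null_flag(frame, column, suffix="_add"):
    """Return a copy of frame with a 0/1 column marking missing values in `column`."""
    out = frame.copy()
    # isnull() gives a boolean Series; astype(int) turns True/False into 1/0.
    out[column + suffix] = out[column].isnull().astype(int)
    return out


# Shortened stand-in for Loan_a1 (assumed data, for illustration only).
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan, 57.0]})
flagged = add_null_flag(demo, "mths_since_recent_revol_delinq")
print(flagged)
```

Because the helper works on a copy, the frame passed in is left untouched.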

Answer 1: (score: 0)

I think this solves your problem with minimal changes to your code:

def label_race (df):
   if pd.isna(df['mths_since_recent_revol_delinq']):
      return 1
   else:
      return 0

Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply(lambda df: label_race(df), axis=1)
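
The reason the original version failed can be seen on a single row. With axis=1, apply passes each row to the function as a Series, so indexing it by column name yields one scalar rather than a Series, and scalars have no isnull method. The sketch below uses a two-row demo frame (an assumed stand-in for Loan_a1) to show the difference:

```python
import numpy as np
import pandas as pd

# Two-row stand-in for Loan_a1 (assumed data, for illustration only).
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0]})

# A single row, as apply(..., axis=1) would hand it to the function.
first_row = demo.iloc[0]
value = first_row["mths_since_recent_revol_delinq"]  # one scalar, not a Series

has_isnull_method = hasattr(value, "isnull")  # the method the original code called
is_missing = pd.isna(value)                   # pd.isna accepts scalars as well as Series
print(has_isnull_method, is_missing)
```

pd.isna is the top-level function, so it works on the scalar where the Series method is unavailable.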

Answer 2: (score: -1)

Following @Jezrael's structure and definitions, and using the neat trick that True * 1 = 1 and False * 1 = 0, you can get the same result with assign:

# ** unpacking names the column by the value of results_key, not the literal "results_key"
Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})

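Note that this method returns a new DataFrame directly, so no separate column assignment is needed. Loan_a1 itself is left unchanged, so keep the returned value if you need it later.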
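
A minimal sketch of how assign behaves here, using a small demo frame (demo, with_flag and with_flag_2 are illustrative names, and the data is assumed). A keyword passed to assign is used literally as the column name, so a name held in a variable has to be passed through dict unpacking; in both cases the original frame is not modified.

```python
import numpy as np
import pandas as pd

# Small stand-in for Loan_a1 (assumed data, for illustration only).
demo = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan]})
input_key = "mths_since_recent_revol_delinq"
results_key = "mths_since_recent_revol_delinq_add"

# Keyword form: the keyword itself becomes the column name.
with_flag = demo.assign(
    mths_since_recent_revol_delinq_add=lambda x: x[input_key].isnull() * 1
)

# Dict-unpacking form: the column name comes from the variable results_key.
with_flag_2 = demo.assign(**{results_key: lambda x: x[input_key].isnull() * 1})

print(with_flag)
print(list(demo.columns))  # demo still has only its original column
```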