Filling a column of a CSV file with values in Python

Time: 2019-01-23 09:54:13

Tags: python python-3.x pandas

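I have a CSV file in which some fields in the "Gender" column are missing, so I need to fill them in automatically in Python using the fillna() function. My condition is that if ApplicantIncome is greater than 20000, the missing field in the "Gender" column must be filled with the label "Male". The code is pasted below, together with the error it raises. Can anyone help me fix the error?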

  

if data['ApplicantIncome'] >= 20000:
    data['Gender'].fillna(data['Gender'] == 'Male', inplace=True)

The error is as follows:

ValueError                                Traceback (most recent call last)
<ipython-input-97-19fee6c4a819> in <module>
----> 1 if data['ApplicantIncome'] >= 20000:
      2     data['Gender'].fillna(data['Gender'] == 'Male',inplace=True)

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1574         raise ValueError("The truth value of a {0} is ambiguous. "
   1575                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1576                          .format(self.__class__.__name__))
   1577 
   1578     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

2 answers:

Answer 0 (score: 1)

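You can mask your Series. The comparison yields a boolean Series with one value per row, which is what mask expects (and what a plain if statement cannot evaluate):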

data['Gender'] = data['Gender'].mask(data['ApplicantIncome'] >= 20000, data['Gender'].fillna('Male'))
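
To see what the mask approach does row by row, here is a minimal sketch. The sample rows are made up for illustration; only the column names ApplicantIncome and Gender come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "ApplicantIncome": [25000, 5000, 30000, 21000],
    "Gender": [np.nan, np.nan, "Female", "Male"],
})

# Boolean Series: one True/False per row instead of a single truth value.
high_income = data["ApplicantIncome"] >= 20000

# Where the condition is True, take the value from the filled copy;
# elsewhere keep the original value (including any remaining NaN).
data["Gender"] = data["Gender"].mask(high_income, data["Gender"].fillna("Male"))

print(data)
```

Only the first row changes here: it is both missing and above the threshold. The low-income row stays missing, and existing labels are left untouched.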

Answer 1 (score: 0)

Use np.where():

import pandas as pd
import numpy as np
data['Gender'] = np.where(data['ApplicantIncome'] >= 20000, data['Gender'].fillna('Male'), data['Gender'])
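
The same idea can also be written with the condition spelled out explicitly, which may be easier to read. This is only a sketch on made-up rows; the variable name needs_fill is an illustrative choice, not something from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "ApplicantIncome": [25000, 5000, 30000],
    "Gender": [np.nan, np.nan, "Female"],
})

# Row-wise condition: income at or above the threshold AND gender missing.
# Note the element-wise & operator; Python's `and` would raise the same ValueError.
needs_fill = (data["ApplicantIncome"] >= 20000) & data["Gender"].isna()

# np.where picks "Male" where the condition holds and the original value elsewhere.
data["Gender"] = np.where(needs_fill, "Male", data["Gender"])

print(data)
```

np.where returns a NumPy array of the same length as the column, so assigning it back to data["Gender"] replaces the column in place.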