Question

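I have a CSV file in which some fields in the "Gender" column are missing, so I need to fill them in automatically in Python using the fillna() function. My condition is: if ApplicantIncome is greater than 20000, the missing field must be filled with the label "Male" in the "Gender" column. The code is pasted below, together with the error it produces. Can anyone help me fix this error?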

if data['ApplicantIncome'] >= 20000:
    data['Gender'].fillna(data['Gender'] == 'Male',inplace=True)

The error is as follows:

ValueError                                Traceback (most recent call last)
<ipython-input-97-19fee6c4a819> in <module>
----> 1 if data['ApplicantIncome'] >= 20000:
      2     data['Gender'].fillna(data['Gender'] == 'Male',inplace=True)

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1574         raise ValueError("The truth value of a {0} is ambiguous. "
   1575                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1576                          .format(self.__class__.__name__))
   1577 
   1578     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1

You can mask your Series:

data['Gender'] = data['Gender'].mask(data['ApplicantIncome'] >= 20000, data['Gender'].fillna('Male'))
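
To see what the mask call does, here is a minimal sketch on a small made-up DataFrame. The sample values are assumptions for illustration only; the column names follow the question. Rows with an income of at least 20000 take their value from the filled copy of the column, and all other rows are left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "ApplicantIncome": [25000, 5000, 30000, 21000],
    "Gender": [np.nan, np.nan, "Female", "Male"],
})

# Element-wise comparison: yields a boolean Series, one value per row.
high_income = data["ApplicantIncome"] >= 20000

# Where high_income is True, take the value from the filled copy;
# elsewhere keep the original value (including NaN).
data["Gender"] = data["Gender"].mask(high_income, data["Gender"].fillna("Male"))

print(data)
```

Only the first row changes here: it is the only row that is both missing and above the threshold. The low-income row stays missing, and existing labels are never overwritten.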

Answer 2

Use np.where():

import pandas as pd
import numpy as np
data['Gender'] = np.where(data['ApplicantIncome'] >= 20000, data['Gender'].fillna('Male'), data['Gender'])
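
The original error arises because `data['ApplicantIncome'] >= 20000` produces a whole boolean Series, which a plain `if` cannot reduce to a single True or False. As an alternative that is not taken from either answer, the same row-wise condition can be written with boolean indexing and `.loc`. The sketch below uses made-up sample values; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "ApplicantIncome": [25000, 5000, 30000, 21000],
    "Gender": [np.nan, np.nan, "Female", "Male"],
})

# Combine both conditions element-wise with &, not with `if` or `and`.
needs_fill = (data["ApplicantIncome"] >= 20000) & data["Gender"].isna()

# Assign only to the selected rows of the Gender column.
data.loc[needs_fill, "Gender"] = "Male"

print(data)
```

Writing the condition out this way makes it explicit that only rows that are both missing and above the income threshold are changed.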

Filling a column of a CSV file with values in Python
