Question

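I have been working with the Adult census dataset, available at: https://archive.ics.uci.edu/ml/datasets/census+income

From what I have read, it contains some missing values marked with "?". I am building a classifier, so I want to replace those values with the mode, but I have run into some problems. My source code is below, with comments on the issues I hit: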

import pandas as pd
from sklearn import preprocessing
import numpy as np

def open(fileR):
    head=["gt lt 50","age","workclass","fnlwgt","edu","edu-num","mar-sta","occ","rela","race","sex","cap-gain","cap-loss","country","hpw"]
    f=pd.read_csv(fileR,sep=',')
    f.columns=head
    f.replace('?',np.nan)   #I want to replace the ? values with nan 
    f = f.fillna(f.mode().iloc[:,1])        #replace the nan values with the mode
    print (f.iloc[:,1])

But the values I get back still contain the ? symbol, for example:

25                 Private
26                       ?
27                 Private
28                 Private
29               Local-gov

I want to replace every ? in the categorical variables of the f DataFrame with the mode. Am I missing a step?

P.S.

I also tried the following to check just one column:

    f.replace('?',np.nan,inplace=True)
    f = f.fillna(f.mode().iloc[:,1])
    print (f.iloc[:,1])

But it still prints the ? values.

Thanks
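For anyone reproducing this, here is a minimal sketch of one likely cause and a possible fix. It assumes the file is the raw adult.data, where every comma is followed by a space, so the cells actually hold " ?" and an exact match on '?' never fires. Separately, f.replace(...) without inplace=True returns a new frame that is thrown away, and f.mode().iloc[:,1] picks one column of the mode table rather than its first row. The sample rows and the shortened column list below are made up for illustration and are not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for a few rows of adult.data.
# Note the blank after each comma, which is how the raw file is laid out.
raw = """39, State-gov, Bachelors
50, ?, Bachelors
38, Private, HS-grad
53, Private, ?
28, Private, Bachelors
"""

# Shortened, illustrative column list (not the full 15 columns).
cols = ["age", "workclass", "edu"]

# header=None keeps the first data row from being consumed as a header.
# skipinitialspace=True drops the blank after each delimiter, and
# na_values="?" turns the remaining bare "?" into NaN at read time.
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=cols,
    skipinitialspace=True,
    na_values="?",
)

# mode() returns a DataFrame; its first row holds the most frequent
# value of each column, indexed by column name.
modes = df.mode().iloc[0]

# fillna with a Series aligns on column names, so each column is
# filled with its own mode.
df = df.fillna(modes)

print(df["workclass"].tolist())
print(df["edu"].tolist())
```

If the frame is already loaded, a similar effect should be possible by stripping whitespace from the string columns before replacing "?" with np.nan, and assigning the result back (or passing inplace=True).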

Handling missing categorical values with pandas?

0 answers: