Question

I can't get the result I want from the scikit-learn Imputer on categorical columns.

df_cat = df[["Suburb", "Address", "Type", "Method", "SellerG", "Date", "YearBuilt", "CouncilArea", "Regionname"]]
df_cat['Suburb'] = pd.to_numeric(df_cat['Suburb'], errors='coerce')
df_cat['Address'] = pd.to_numeric(df_cat['Address'], errors='coerce')
df_cat['Type'] = pd.to_numeric(df_cat['Type'], errors='coerce')
df_cat['Method'] = pd.to_numeric(df_cat['Method'], errors='coerce')
df_cat['SellerG'] = pd.to_numeric(df_cat['SellerG'], errors='coerce')
df_cat['YearBuilt'] = pd.to_numeric(df_cat['YearBuilt'], errors='coerce')
df_cat['CouncilArea'] = pd.to_numeric(df_cat['CouncilArea'], errors='coerce')
df_cat['Regionname'] = pd.to_numeric(df_cat['Regionname'], errors='coerce')
df_cat['Date'] = pd.to_numeric(df_cat['Date'], errors='coerce')

from sklearn.preprocessing import Imputer
imputer2 = Imputer(strategy="most_frequent")
imputer2.fit(df_cat)
imputer2.statistics_

Result

array([     nan,      nan,      nan,      nan, 1.00e+00,      nan,
       1.97e+03,      nan,      nan])

The existing values were replaced with NaN. This is the exact opposite of what I need.
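A likely cause, sketched under assumptions: `pd.to_numeric(..., errors='coerce')` turns every value it cannot parse as a number into NaN, so text columns such as Suburb or Type become entirely NaN before the imputer ever sees them. The snippet below uses a small made-up DataFrame (the rows are hypothetical, not from the real dataset) and `sklearn.impute.SimpleImputer`, which replaced the older `sklearn.preprocessing.Imputer` and accepts string columns with `strategy="most_frequent"`. It is one possible approach, not necessarily the only fix.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data standing in for the real df_cat (values are made up)
df_cat = pd.DataFrame(
    {
        "Suburb": ["Abbotsford", "Richmond", np.nan, "Richmond"],
        "Type": ["h", "u", "h", np.nan],
    },
    dtype=object,
)

# errors="coerce" converts every non-numeric string to NaN,
# so a text column ends up containing nothing but NaN
coerced = pd.to_numeric(df_cat["Suburb"], errors="coerce")
all_nan = bool(coerced.isna().all())

# Skip the numeric conversion and impute the string columns directly
imputer = SimpleImputer(strategy="most_frequent")
filled = pd.DataFrame(imputer.fit_transform(df_cat), columns=df_cat.columns)

# statistics_ now holds the most frequent category of each column
most_frequent = list(imputer.statistics_)
```

Here `all_nan` shows that the coerced column carries no usable values, while `most_frequent` holds one category per column and `filled` has no missing entries left.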

Unable to use the scikit-learn categorical Imputer

0 answers: