I have a DataFrame on which I successfully used dropna(), as shown here:
proc_train.isnull().any()
id False
perc_premium_paid_by_cash_credit False
age_in_days False
Income False
Count_3-6_months_late False
Count_6-12_months_late False
Count_more_than_12_months_late False
application_underwriting_score False
no_of_premiums_paid False
premium False
renewal False
sourcing_channel_B False
sourcing_channel_C False
sourcing_channel_D False
sourcing_channel_E False
Urban/Rural False
prem_to_inc_ratio False
late36_612 False
late36_12more False
late612_12more False
perc_times_prem False
I then tried to select the columns to use as input variables:
X_train = proc_train.loc[:, proc_train.columns != 'renewal']
X_train = X.loc[:, X.columns != 'id']
But after that, the null values are all back:
X_train.isnull().any()
perc_premium_paid_by_cash_credit False
age_in_days False
Income False
Count_3-6_months_late True
Count_6-12_months_late True
Count_more_than_12_months_late True
application_underwriting_score True
no_of_premiums_paid False
premium False
sourcing_channel_B False
sourcing_channel_C False
sourcing_channel_D False
sourcing_channel_E False
Urban/Rural False
prem_to_inc_ratio False
late36_612 True
late36_12more True
late612_12more True
perc_times_prem False
Why does this happen, and what is a better way to do this?
Answer 0 (score: 0)
This line:
X_train = X.loc[:, X.columns != 'id']
should be
X_train = X_train.loc[:, X_train.columns != 'id']
which produces the same all-False result for isnull().any() as before.
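As for why it happens: the second assignment reads from X rather than from X_train, so it throws away the result of the first line and selects from whatever X happens to be. My guess is that X is an older DataFrame still in your session from before dropna() was applied, but that is an assumption, since the question does not show where X is defined. One way to avoid this kind of slip is to drop both columns in a single call. Below is a minimal sketch; the toy data in raw and the name y_train are made up for illustration, only the column names come from your question.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the raw data; the values are invented for illustration.
raw = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Income": [50000.0, 62000.0, 58000.0, 71000.0],
    "application_underwriting_score": [99.1, np.nan, 98.7, 99.4],
    "renewal": [1, 0, 1, 1],
})

# Remove rows with missing values, as in the question.
proc_train = raw.dropna()

# Drop the target and the identifier in one step, always starting
# from the cleaned frame, so no stale variable can sneak in.
X_train = proc_train.drop(columns=["renewal", "id"])
y_train = proc_train["renewal"]  # hypothetical name for the target column

print(X_train.isnull().any())
```

Because everything is derived from proc_train in one expression, there is no intermediate variable to mistype.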