XGBoost feature_importances_ returns a vector of zeros

Date: 2018-03-07 10:03:52

Tags: python pandas machine-learning statistics

I have been experimenting with XGBClassifier on a large dataset (shape [400000, 93]). The data contains a large number of NaN values, so I used the imputer from the sklearn package:

from sklearn.preprocessing import Imputer

imputer = Imputer()  # mean imputation by default
imputed_x = imputer.fit_transform(data)
data = imputed_x
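For reference, `Imputer` was replaced by `SimpleImputer` in later scikit-learn releases; a minimal self-contained sketch of the same mean imputation (the data here is made up for illustration) looks like:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Tiny hypothetical array with missing values
data = np.array([[1.0, np.nan],
                 [3.0, 4.0],
                 [np.nan, 6.0]])

imputer = SimpleImputer(strategy="mean")  # mean is the default, matching Imputer()
imputed_x = imputer.fit_transform(data)

# Each NaN is replaced by its column mean: column 0 mean = 2.0, column 1 mean = 5.0
print(imputed_x)
```

It is worth verifying after this step that no NaNs remain, since XGBoost will otherwise handle them with its own internal default-direction logic.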

But the feature importance values look like this:

    [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
     0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
     0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
     0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

As a result, the evaluation metrics are:

precision: 1.0

recall: 1.0

accuracy: 1.0

training_accuracy: 1.0

Why does the model not fit the data properly?

Sample code snippet:

model_xboost = XGBClassifier(max_depth=5,
                             n_estimators=100)

# train
model_xboost.fit(train_data, train_labels)
print(model_xboost.feature_importances_)

0 Answers:

There are no answers yet.