How to display the original feature names in a feature importance plot?

Time: 2018-06-27 14:33:58

Tags: python pandas xgboost

I created an XGBoost model as follows:

y = XY.DELAY_MIN
X = standardized_df

train_X, test_X, train_y, test_y = train_test_split(X.as_matrix(), y.as_matrix(), test_size=0.25)

my_imputer = preprocessing.Imputer()
train_X = my_imputer.fit_transform(train_X)
test_X = my_imputer.transform(test_X)

xgb_model = XGBRegressor()

# Add silent=True to avoid printing out updates with each cycle
xgb_model = XGBRegressor(n_estimators=1000, learning_rate=0.05)
xgb_model.fit(train_X, train_y, early_stopping_rounds=5, 
             eval_set=[(test_X, test_y)], verbose=False)

When I create the feature importance plot, the feature names are shown as "f1", "f2", etc. How can I display the original feature names?

fig, ax = plt.subplots(figsize=(12,18))
xgb.plot_importance(xgb_model, max_num_features=30, height=0.8, ax=ax)
plt.show()

1 Answer:

Answer 0: (score: 2)

The problem is that the Imputer's transform() does not return a pd.DataFrame; it returns a NumPy array, so the column names are lost when you call fit_transform() and transform() on your data.

A simple solution is to wrap the imputer output in a DataFrame, e.g.:

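# X is the original DataFrame (standardized_df), so X.columns holds the feature names
train_X = pd.DataFrame(my_imputer.fit_transform(train_X), columns=X.columns)
test_X = pd.DataFrame(my_imputer.transform(test_X), columns=X.columns)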
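
For reference, here is a minimal self-contained sketch of the same idea on a tiny made-up dataset. The column names and values (dep_hour, distance, the delay numbers) are hypothetical stand-ins for standardized_df and XY.DELAY_MIN. It uses SimpleImputer, the replacement for preprocessing.Imputer in newer scikit-learn releases, and splits the DataFrame directly instead of calling as_matrix(), which already discards the names before the imputer runs.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer  # successor of preprocessing.Imputer
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for standardized_df / XY.DELAY_MIN from the question
X = pd.DataFrame({
    "dep_hour": [6.0, 9.0, np.nan, 14.0, 18.0, 21.0, 7.0, 12.0],
    "distance": [300.0, np.nan, 800.0, 450.0, 1200.0, 650.0, np.nan, 900.0],
})
y = pd.Series([5.0, 12.0, 0.0, 30.0, 45.0, 8.0, 3.0, 20.0], name="DELAY_MIN")

# Split the DataFrame itself so the column names survive the split
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

my_imputer = SimpleImputer()

# fit_transform() returns a plain NumPy array: the column names are gone here
imputed_train = my_imputer.fit_transform(train_X)

# Wrap the arrays back into DataFrames to restore the original names
train_X = pd.DataFrame(imputed_train, columns=X.columns, index=train_X.index)
test_X = pd.DataFrame(
    my_imputer.transform(test_X), columns=X.columns, index=test_X.index
)

print(type(imputed_train).__name__)
print(list(train_X.columns))
```

If the model is then fitted on these DataFrames rather than on raw arrays, xgb.plot_importance should label the bars with the column names instead of the generic f-indices. One caveat: with the default mean strategy, the imputer can drop a column that is entirely missing in the training data, in which case columns=X.columns would no longer match the output shape.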